Preface



Our ideas of the nature of the primordial universe have varied with time, 
closely following our understanding of physics. With Einstein and Hubble in 
the 1920s we learned from General Relativity that observing distant objects 
in the expanding universe was the key to unravelling our past. Primordial 
at that time meant the study of the past of a universe apparently younger 
than the Solar System, at an age of 1 billion years. This contradiction was 
only resolved in the 1950s as the modern distance scale became accepted. 
Gamow in the 1950s also understood the role of Nuclear Physics in the 
synthesis of the light elements during what we call now the Big Bang. Al- 
though it took a decade, until the discovery of the Cosmic Microwave Back- 
ground Radiation, for Gamow’s pioneering work to be recognised, these 
ideas provided theoretical as well as observational access to the primordial 
universe when it was only 3 min old, a big jump, undoubtedly. Sakharov 
in the 1960s told us why the universe was made of baryons despite the 
physics predictions of the existence of particles and antiparticles with similar 
properties. 

The next step was to understand, thanks to Kirzhnits and Linde, that
the particle physics motivated unification of the weak and electromagnetic 
interactions (now well established) implied a phase transition in the early 
universe, when it was barely $10^{-10}$ s old, with, at earlier times, much simpler
laws of physics that were fundamentally different from those that presently 
hold. Also, Guth conjectured in the early 1980s that there was another ma- 
jor phase change when the strong interactions unified with the electroweak 
interactions, at an age of $10^{-35}$ s. The triggering of inflation explains the size
of our present universe, which is a factor 10^*^ larger than the microphysics 
scale. Before this era, all particles would be massless and all interactions 
(but gravity) the same, within a perfectly symmetric universe! At these 
scales, however, the predictions of particle physics are far from being con- 
firmed by accelerator experiments: so it became customary for cosmologists 
to invent for convenience their own laws of Physics, often differing from 
those particle physicists were devising separately! 

Another major problem of Physics and Cosmology is the composition 
of the Universe. We know, as Zwicky discovered from the dynamics of 
the Coma cluster nearly 60 years ago, that the luminous matter gener- 
ates only 1% of the gravitational field that is observed. Astonishingly, 
this dark matter seems to follow rather closely the irregularities in the 
distribution of the luminous matter. This dark matter cannot be in the
form of star-like dark objects, as shown by the EROS and MACHO mi- 
crolensing surveys, and primordial nucleosynthesis shows that only a small 
fraction can be in the form of baryons. The only sensible hypothesis is 
that there exists another, yet unknown, massive particle, in addition to the 
baryons, in large amounts, which is responsible for the observed missing 
mass. 

In this School, the Primordial Universe is understood as the period from 
the electroweak unification up to the remotest epoch that is accessible to 
our knowledge. It reviews the achievements of the last decade, together 
with the latest new topics. 

S. Lilly tells us about the present universe, what is observed, how one
can describe it simply and about the explorations at high redshift. We
now have a serious hint from our recent past, where galaxies are seen to
evolve strongly with intense star formation. J. Silk gives us the latest val- 
ues of the cosmological parameters; we now know the value of the Hubble 
constant to within 10%. K. Olive, in a short review of Primordial Nucle- 
osynthesis, shows the recent, rather puzzling, abundance measurements to 
be no longer a burden. F. Bouchet, J.L. Puget and J.M. Lamarre review 
the microwave background fluctuation measurements. The prospects from 
the MAP and PLANCK satellites are impressive. We will know everything 
about primordial fluctuations: the shape of the spectrum, the geometry of 
space after recombination, and the baryon fraction. Balloon-borne flights, 
however, are now starting to be challenging competitors. They just told us 
that our universe is flat! Along the path opened by COBE, this gives obser- 
vational access to the epoch when the universe was no more than 10“^° s 
old, undoubtedly a bridge towards the microphysics that determined our 
origins. 

The microphysics relevant for cosmology is reviewed in the major part 
of these lectures. The good news is that fashions change: the idea now 
is to use the laws of physics that are thought to be realistic by particle 
physicists rather than those that turn out to be convenient for cosmology. 
This is a difficult task. K. Olive reminds us about the basic ideas behind 
the Supersymmetric Theories. The new particles predicted by these theories 
are expected to be at the origin of the observed dark matter. Searches are 
underway to detect these particles. As shown by G. Chardin the detectors 
are now at the limit of reaching the required sensitivity considering the 
expected interaction rates. 

The desires of cosmologists, confronted with the requirements of par- 
ticle physics to explain the evolution of the Universe during inflation, the 
creation of matter soon afterwards, and the appearance of quantum fluctu- 
ations, very likely at the origin of the gravitational structures we see today, 
are reviewed by A. Linde. The observations allow the cosmological constant
found in the supernova surveys to vary slowly with epoch. This is
predicted by some theories based on supersymmetry. This component is 
then to be interpreted as a new kind of matter that has been given the 
name “Quintessence”. P. Binetruy reviews the state of the art. R. Kallosh, 
in another timely illustration, emphasizes the very special cosmological role 
of the gravitino. N. Turok discusses the defects that may appear during the 
various phase transitions which occur in the early universe, and the associ- 
ated astrophysical constraints. The electroweak transition now appears to 
be too smooth to be responsible for the origin of the baryon asymmetry, 
at variance with the standard working hypothesis of the last decade. As a 
start to the third part of the course, N. Turok also tells us his views about 
what happened right after the Planck era. 

The final courses deal with the Planck era, just after, or just before. 
Superstrings and M-Theory are the natural extension of the Supersymmetric 
theories, including gravity. T. Banks provides us with quite an appealing 
introduction to these matters. From the symmetry between positive and 
negative times that holds in superstring theory, G. Veneziano shows us
how the post- and pre-Big Bang eras are related.

Cosmology and particle physics meet again. They have never been very 
far apart in the last 20 years, but the ties were never so close. Clearly, 
the constraints of particle physics on cosmological scenarios are severe, but 
the reverse also holds: not all theories of the elementary particle interac- 
tions survive when they are required to explain our origins. The major 
issue is still to unravel the nature of dark matter, which possibly appears in 
the form of several, fundamentally different, components. Also, we still do 
not understand how the baryon asymmetry built up in our Universe. The 
most modern theories that unify all fundamental interactions, still awaiting
experimental confirmation, now give us a hint as to what conditions
were prevailing not only at the Planck era, but even before the start of the 
Big Bang. Undoubtedly, finding out how these old problems and these new 
ideas are entangled will be the challenge of the next decade. 

We thank the Université Joseph Fourier, Grenoble, the Centre National
de la Recherche Scientifique (CNRS) and the Commissariat à l'Énergie
Atomique (CEA) for their continuing support to the Les Houches School.
This session was held with special support kindly provided by the
Formation Permanente and the Département des Sciences Physiques et
Mathématiques (SPM) of the CNRS, the Ministère de l'Éducation
Nationale, de la Recherche et de la Technologie (MENRT), the Direction
des Sciences de la Matière and the Service de Physique Théorique of the
CEA. We also thank the NSF and NASA for supporting US teachers and
students.
The session ran very smoothly thanks to Gislaine D'Henry, Isabel Lelievre
and Brigitte Rousset. Thanks are also due to “Le Chef” and his team for
their great cuisine, as well as to all the people in Les Houches who made this
wonderful session possible. 



Pierre Binetruy 
Richard Schaeffer 
THE UNIVERSE AT HIGH REDSHIFT



S. Lilly 



1 Introduction 

1.1 The formation of structure in the Universe 

Understanding the formation of structure in the Universe is a central theme 
of much of current observational astronomy and is the motivation for the 
construction of many of the current generation of very large (and expensive) 
observational facilities. One of the most observationally challenging areas 
is the effort to understand the physical processes that shape the Universe 
at high redshifts, and in particular those that lead to the formation and 
evolution of galaxies. 

It is generally accepted that structure on the scales of galaxies and larger 
arose in the Universe through the gravitational amplification of small den- 
sity fluctuations produced by physical processes occurring at much earlier 
times. These density fluctuations can be studied on different mass scales. 
On the largest scales (comoving scales of > 100 Mpc) observations of tem- 
perature distortions on the last scattering surface offer a particularly clean 
view since the amplitude of the fluctuations is very small (and securely in
the linear regime) and the physical conditions are particularly simple. On 
scales of around 10 comoving Mpc the result of these density fluctuations 
can best be studied at the present epoch, where structures are just now 
becoming non-linear. Structures on smaller scales, such as galaxies and the 
small scale structure seen in the intergalactic medium, will generally have 
formed at significantly earlier cosmic epochs and their formation must be 
studied at high redshift. This is the main emphasis of this Chapter. The 
physics of the smallest structures should be quite simple: these will consist 
of warm baryons existing within dark matter potential wells. The physics 
on galactic scales is likely to be more complex. We know that loss of energy 
through dissipation and energy injection via stellar processes (especially su- 
pernovae) will have played a major role in determining the behaviour of the 
baryonic component relative to the dark matter and were thus instrumental 
in shaping the formation and evolution of galaxies. This makes the re- 
construction of information on the primordial density fluctuations difficult. 

© EDP Sciences, Springer-Verlag 2000
The Primordial Universe



But, on the other hand, the very richness of the physics gives us scope for 
a deep understanding of the processes that, after all, represent a significant 
chapter in the story of how we ourselves have come to be. 

These lecture notes are aimed at the students at the Summer School 
who are primarily interested in the physics of the very early Universe but 
who nevertheless wish to obtain a background understanding of the late- 
epoch Universe and of the main physical processes that have shaped its 
appearance. The “high” redshifts of observational astronomers, 1 < z < 10, 
are completely inconsequential compared with the subject matter of most 
of the other lectures at this Summer School. Nevertheless, that part of the 
Universe, in space and time, that is accessible to astronomical observation 
represents a basic reality check on theoretical ideas concerning the very early 
Universe. 

The last decade has seen tremendous progress in opening up the dis- 
tant Universe to observation. However, it should be appreciated that while 
we have obtained a first glimpse of what happened long ago, our view is 
far from complete and, to a large degree, we are still blinkered by that 
which is technologically feasible. It is therefore important to stress that 
the answers to many basic questions about the formation of galaxies and of 
large-scale structure at early times and indeed, even about the properties of 
the present-day Universe, are still quite uncertain. The emphasis in these
lectures will be on presenting simple physical ideas and on observational 
data. However, it is increasingly apparent that numerical simulations will 
be playing a larger and larger role in developing our understanding of the 
high redshift Universe. 

1.2 Methodologies, opportunities and limitations 

It is of course well known that the finite speed of light allows us to ob- 
serve distant parts of the Universe as they were at an earlier cosmic time 
and thus to directly observe the evolution of the Universe. But it should 
be appreciated that there are obvious limitations that are associated with 
this approach. Trivially, an assumption of homogeneity must be made if we 
are to presume that distant parts of the Universe will evolve into something 
similar to what we see in the local Universe. Furthermore, all studies of evo- 
lution are necessarily statistical in nature: we cannot see individual objects 
evolve (except in the most unusual cases, such as supernovae or gamma-ray 
bursts) and so we must infer the evolution of individual objects from changes 
observed within populations. This opens up many difficulties in defining in 
a consistent way the appropriate populations at different epochs, especially 
since we are nearly always dealing with a continuous distribution in the 
properties that are of interest. 

The available technologies for observation still define what can be ob- 
served and what cannot. At present, we can only observe electromagnetic 
radiation, so dark matter can only be detected through its gravitational effect.
Even within the realm of electromagnetic radiation, our view is far from
complete. It should be appreciated that the traditional “visible” waveband 
($0.4 < \lambda < 1.0\,\mu$m) still gives us by far the “deepest” and most extensive
view of the distant Universe. The near infrared ($1.0 < \lambda < 2.3\,\mu$m) has
recently caught up in sensitivity, but is still limited by the small size of the 
detector arrays. This dominance of the optical/near-infrared is illustrated 
by the fact that the deepest images in the visible and near-infrared wavebands
(such as the Hubble Deep Field) detect hundreds of objects arcmin$^{-2}$,
while those in the radio, sub-millimeter and X-ray wavebands are limited to 
number densities of order 1 arcmin$^{-2}$ or less. This represents a significant
and pervasive bias. 

Theory, and in particular numerical simulation, is playing an increasingly 
important role in developing our understanding of the Universe at earlier 
epochs. In this we are fortunate that the physics of the non-interacting dark 
matter appears to be by nature simple, especially in the linear regime. Thus, 
numerical simulations have been rather successful in reproducing structure 
on scales larger than galaxies. However, the physics describing the be- 
haviour of the baryons within galaxies is much more complicated and we 
are a long way from being able to satisfactorily simulate the processes of 
galaxy formation and evolution. 



1.3 Outline of the lectures 

In these lectures, I start by reviewing some of what we know about the 
Universe at the present epoch, especially in terms of the properties of galax- 
ies and of the galaxy population. In Section 3, I briefly review standard 
Cosmology, not least in the context of interpreting observations of distant 
sources of light, since uncertainties in the cosmological parameters are more 
important at $z < 10$ than in the very early Universe more familiar to many
readers of these lectures. I also describe the basic results concerning the 
growth through gravitational instability of density fluctuations in the Uni- 
verse. In Section 4, I continue this approach by briefly describing a number 
of theoretical ideas encountered in the non-linear evolution of density fluc- 
tuations and the behaviour of baryons within dark matter haloes. 

I then turn to review the observational situation, looking at the studies 
of galaxy evolution that have been carried out at moderate and high red- 
shifts and at what we can learn from studies of the absorption of light by 
intervening material. Although we have made great progress in the last few 
years, I hope it will become clear how patchy and incomplete our observa- 
tional picture of the high redshift Universe still is at the present time. The 
separation between theory and observation in these lectures seems natural to 
me (as an observer) in that until quite recently, theoreticians and observers 
(and even different shades of observers) have pursued rather separate paths
towards understanding the high redshift Universe. 

I will conclude by arguing that the available observations are broadly 
consistent with the standard picture of a hierarchical assembly of struc- 
ture in the Universe with dissipation playing a key role in the formation of 
galaxies. 

It should be said that the 50-odd pages of these notes make it impossible 
to do justice to this rapidly expanding area of research. Shortly before this
summer school, a Conference took place in Berkeley honouring Hy Spinrad’s 
career achievements in this field. The interested reader is referred to the 
800 pages of those conference proceedings for a more complete account [1]. 

2 The present-day Universe 

2.1 Galaxies 

2.1.1 Normal galaxies 

To a large degree, galaxies are the building blocks of the Universe. Within 
galaxies, the complex cycle of stellar birth, life and death, a process that 
has our own existence as a product, takes place, largely unaffected by the 
expansion of the surrounding Universe. Outside of galaxies, we can hope 
that physics is relatively simple and everywhere reflects the fundamental 
expansion of the Universe. 

The dominant component of galaxies is the dark halo comprising 90% or 
more of the mass and extending well beyond the visible regions of the galaxy. 
The existence of the halo and the fact that it is less spatially concentrated 
than the luminous material is clearly demonstrated by the remarkably flat
rotation curves of spiral galaxies such as NGC 3198 [2]. Several other lines 
of evidence point to extensive dark matter haloes, including the stability 
of galactic disks, the existence of X-ray haloes around elliptical galaxies, 
and the dynamics of the Local Group, of binary galaxies and of clusters of 
galaxies. The dark matter haloes probed by the flat rotation curves extend
smoothly out to the scales of groups of galaxies as probed by binary galaxy 
dynamics. 

Within the dark matter haloes, baryonic material in the form of stars 
is seen in different components. The spheroids are 3-dimensional, generally
triaxial, dynamically hot structures with stars on families
of 3-dimensional orbits. The stars are generally old and the spheroids
do not contain large amounts of gas and dust. In contrast, the disks are 2- 
dimensional symmetric structures supported by dynamically cold rotation. 
The disks are gas rich, with a multi-phase medium that includes large Giant 
Molecular Clouds in which stars are continually forming. These different
families of structure presumably reflect basic differences in their formation. 
Smaller galaxies frequently exhibit a less regular structure, probably reflecting
the increased importance of non-gravitational forces.

The material in disks is metal rich and has undergone an approximately 
continuous star-formation history over a substantial fraction of the age of the 
Universe. Two components in the spheroids can be differentiated: a metal 
rich central component and the extended metal-poor haloes, including the 
globular cluster system. The metal-poor haloes likely reflect the first stars 
that form in the vicinity of a galaxy. Coherent motions of stars in our 
own Milky Way halo indicate that a significant fraction of the halo may 
have been the result of accretion of small satellite units (see [3]) and the 
Sagittarius dwarf galaxy [4] is presently merging with the Milky Way. 

In terms of the projected surface brightness on the sky, the classical 
spheroids represented by elliptical galaxies and the bulges of luminous spiral 
galaxies generally follow the de Vaucouleurs $r^{1/4}$-law profile:

$$\log(I/I_e) = -3.33\left[(r/r_e)^{0.25} - 1\right] \qquad (2.1)$$

while disks follow an exponential profile: 

$$I = I_0\, e^{-r/r_0}. \qquad (2.2)$$

The physical origin of these density profiles is not well understood. The
$r^{1/4}$ profiles appear to be produced through the dynamical relaxation of systems
under gravity while exponential profiles may be a natural consequence of 
infall and conservation of angular momentum. 
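
As a minimal numerical sketch of the two profiles in equations (2.1) and (2.2), the following Python evaluates both in arbitrary units. The function names and the symbol `r_0` for the exponential scale length are illustrative choices, not notation taken from the text, and the logarithm in (2.1) is assumed to be base 10, as is conventional for this form of the law.

```python
import math

def de_vaucouleurs(r, r_e, I_e=1.0):
    """r^(1/4)-law surface brightness: log10(I/I_e) = -3.33 [(r/r_e)^0.25 - 1]."""
    return I_e * 10.0 ** (-3.33 * ((r / r_e) ** 0.25 - 1.0))

def exponential_disk(r, r_0, I_0=1.0):
    """Exponential disk surface brightness: I = I_0 exp(-r / r_0)."""
    return I_0 * math.exp(-r / r_0)

# Compare the two profiles at a few radii (arbitrary units, r_e = r_0 = 1).
for r in (0.25, 1.0, 4.0, 16.0):
    print(f"r = {r:5.2f}  spheroid I/I_e = {de_vaucouleurs(r, 1.0):.4g}  "
          f"disk I/I_0 = {exponential_disk(r, 1.0):.4g}")
```

By construction the spheroid profile equals `I_e` at `r = r_e`, while the disk profile falls by a factor of e for every scale length.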

A basic description of the galaxy population is provided by the lumi- 
nosity function in some observed band. The luminosity function at visible 
wavelengths is usually expressed in terms of the Schechter function [5] : 

$$\phi(L)\,dL = \phi^* (L/L^*)^{\alpha} \exp(-L/L^*)\, d(L/L^*). \qquad (2.3)$$

The three parameters $\alpha$, $L^*$, and $\phi^*$ describe the faint end slope, the
luminosity of the exponential cut-off at high luminosities and the den- 
sity normalization respectively. A modern luminosity function is shown in 
Figure 1. 
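
One useful consequence of the Schechter form is that the total luminosity density has a closed form, $\phi^* L^* \Gamma(\alpha + 2)$, which follows from integrating $L\,\phi(L)$ over all luminosities. The sketch below evaluates the function and checks the closed form against a simple midpoint-rule integration. The function names are hypothetical, and the parameter values ($\alpha = -1$, $\phi^* = L^* = 1$) are placeholders chosen for the check, not measured values.

```python
import math

def schechter(L, phi_star, L_star, alpha):
    """Schechter function of eq. (2.3), per unit L / L_star."""
    x = L / L_star
    return phi_star * x ** alpha * math.exp(-x)

def luminosity_density(phi_star, L_star, alpha):
    """Closed form of the integral of L * phi(L) over all L (requires alpha > -2)."""
    return phi_star * L_star * math.gamma(alpha + 2.0)

def luminosity_density_numeric(phi_star, L_star, alpha, x_max=50.0, n=200000):
    """Midpoint-rule estimate of the same integral in the variable x = L / L_star."""
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        L = x * L_star
        total += L * schechter(L, phi_star, L_star, alpha) * dx
    return total

# Placeholder parameters for illustration only.
print(luminosity_density(1.0, 1.0, -1.0))
print(luminosity_density_numeric(1.0, 1.0, -1.0))
```

The integral converges at the faint end only for $\alpha > -2$, which is why a steep faint-end slope can hide many galaxies while contributing little light.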



It should be appreciated that the mix of galaxy types changes with lumi- 
nosity and so different classes of galaxies have different individual luminos- 
ity functions, see e.g. [6-8]. These differences presumably reflect different 
evolutionary paths taken by baryonic material within dark matter haloes, 
perhaps also reflecting different histories of the haloes themselves (e.g. in
merging history or in the epoch of initial collapse). Generally, the more 
luminous galaxies (certainly at visible wavelengths) are of earlier morphological
type (e.g. are ellipticals and bulge-dominated spirals). Most of the
Fig. 1. Luminosity functions as a function of morphological type, from [8], computed
for $H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$. The data and solid lines are from the SSRS2
survey, the dashed lines represent the CfA survey and the dotted lines represent
the Stromlo-APM survey. It should be noted that the luminosity function of later
galaxy types (Irr/pec) is much steeper.



stellar mass within the luminosity function is in roughly L* galaxies, i.e. in 
elliptical and spiral galaxies. 



2.1.2 Galaxy scaling relations 

One of the most important features of galaxies is that the population ex- 
hibits global scaling relations between the structural parameters introduced 
in the previous section and kinematic measures of circular rotation speeds 
(for disks) or velocity dispersions (for spheroids). Classical spheroids exhibit
a remarkably tight “fundamental plane” linking $r_e$, $\sigma$ and $L$ (one projection
of which is the Faber-Jackson relation between $\sigma$ and $L$).

$$r_e \propto \ldots \qquad (2.4)$$

$$\sigma \propto L^{0.62}\, r_e^{-0.50} \qquad (2.5)$$



Spiral galaxies exhibit the Tully-Fisher relation, which may be written: 



$$L_I = 2.0 \times 10^{10} \left(\frac{v_{\rm rot}}{200\ {\rm km\,s^{-1}}}\right)^{3} h^{-2}\, L_\odot. \qquad (2.6)$$



Both of these relations are remarkably tight (with dispersions in luminosity 
of about 25%). They can be used to estimate distance to galaxies (since 
inferred luminosities and sizes scale with distance but velocities and surface 
brightnesses do not). 
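
The use of the Tully-Fisher relation as a distance indicator can be sketched as follows: the rotation speed gives a luminosity, and comparing that luminosity with the observed flux gives a distance. This is a minimal illustration that reads equation (2.6) as a normalisation of $2.0 \times 10^{10}\,h^{-2}\,L_\odot$ at 200 km s$^{-1}$ with a cubic dependence on rotation speed; both of those readings, the function names, and the neglect of redshift and extinction corrections (a Euclidean inverse-square law, adequate only for nearby galaxies) are assumptions of the sketch.

```python
import math

L_SUN_W = 3.828e26   # nominal solar luminosity in watts
MPC_M = 3.0857e22    # one megaparsec in metres

def tully_fisher_luminosity(v_rot_kms, h=1.0):
    """Luminosity in solar units from the rotation speed in km/s (assumed form of eq. 2.6)."""
    return 2.0e10 * (v_rot_kms / 200.0) ** 3 / h ** 2

def distance_mpc(v_rot_kms, flux_w_m2, h=1.0):
    """Distance in Mpc from the inverse-square law, given an observed flux in W m^-2."""
    luminosity_w = tully_fisher_luminosity(v_rot_kms, h) * L_SUN_W
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2)) / MPC_M

# Round trip: place a 250 km/s spiral at 20 Mpc, compute its flux, recover the distance.
true_d_m = 20.0 * MPC_M
flux = tully_fisher_luminosity(250.0) * L_SUN_W / (4.0 * math.pi * true_d_m ** 2)
print(distance_mpc(250.0, flux))
```

Because distance scales as the square root of luminosity, a 25% dispersion in luminosity corresponds to roughly a 12% uncertainty in the distance to an individual galaxy.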

The origin of these dynamical scaling relations is not fully understood. 
To a certain degree they presumably reflect the uniformity of dark matter 
haloes produced in hierarchical models of structure formation (see 
Sects. 4.2 and 4.5 below). The slope and small scatter of the Tully-Fisher 
relation is reproduced by high resolution numerical simulations of dark mat- 
ter and baryons (with reasonable assumptions about star-formation as a 
function of gas density) but these fail to account for the normalization (see 
[9] and references therein). Parameters such as the mass-to-light ratio of 
the stellar population, the disk rotation speed relative to the halo circular 
velocity and the baryonic mass fractions would all be expected to produce 
dispersions in the relation. It is likely that physical processes act to produce 
correlations between these that parallel the underlying relation (see [9] for 
a discussion). 

Likewise, in the case of early-type spheroidal galaxies, it is often argued
that the impressively small scatter is the result of a remarkably uniform 
formation and evolutionary history. However, there are also degeneracies 
that may mask an underlying dispersion, including the well-known age- 
metallicity degeneracy for older stellar populations [10]. Additional clues to 
the formation come from the relations between metallicity and luminosity 
[11] that are most naturally accounted for in terms of supernovae-driven 
winds expelling gas from smaller potential wells. My own interpretation 
of the scaling relations and the Mg-$\sigma$ relation is that they suggest that the
formation of the dominant stellar populations in the spheroids took place 
within the haloes in which these stars are seen today. This would still be 
consistent with a hierarchical assembly if mergers of galaxies produce large 
star-bursts - a phenomenon for which there is in fact a lot of evidence.



2.1.3 Low surface brightness galaxies 

It is an interesting coincidence that the central surface brightness of disk 
galaxies in the B-band is close to the surface brightness of the dark night
sky and a significant concern is whether we are missing a large number of 
galaxies of much lower surface brightness (which are
harder but not impossible to detect). There is no doubt that some very luminous
low surface brightness disk galaxies exist (e.g. Malin-1) which have a
comparable integrated luminosity to that of the Milky Way [12]. However, 
it appears that the comoving number density of these is substantially below 
that of “normal” higher surface brightness galaxies. The reason for these 
lower surface brightnesses is not well understood. Moreover, at fainter lumi- 
nosities, there are very many low surface brightness dwarf galaxies populat- 
ing the faint end of the luminosity function. The number density of these low 
luminosity galaxies is not at all well-constrained and it may be that these 
outnumber the higher surface brightness galaxies. On these mass scales, 
ejection of baryons driven by energy injection from star-formation is probably
the cause of the low surface brightness. These low luminosity galaxies contain a rather small amount of
mass compared with the more luminous galaxies around L*. 

2.1.4 Dwarf galaxies 

A wide variety of small low mass galaxies exist, collectively known as dwarf 
galaxies. Although these manifest a confusingly broad range of properties 
and appear at first sight quite dissimilar, it is likely that they do in fact 
represent different states of a more uniform type of galaxy. Their distinct
properties relative to more normal galaxies are nicely illustrated by the loca- 
tion of dwarf spheroidal galaxies relative to normal ellipticals and globular 
clusters in the $L$, $r_e$, $\sigma$ plane [13]. These three types of object all have similar
old stellar populations making such a comparison more straightforward. 

The dwarf spheroidals have low surface brightness smooth profiles and 
very high M/L values. In the nearest examples within the Local Group, 
in which the stellar populations can be resolved into individual stars, there
is evidence for multiple stellar populations arising from discrete star-burst 
episodes with ages of a few to several Gyr [14]. The mass-to-light ratios 
are very high, typically 10 times that of old stellar populations in the Milky 
Way. At the other extreme of activity, there are Blue Compact Dwarf (BCD)
galaxies. These show evidence for on-going vigorous starburst activity in
which the star-formation rate is much higher than the time average within 
the galaxy. They exhibit an irregular clumpy morphology and have low 
metallicities (typically 0.1 times solar with a wide dispersion) and have 
only poorly determined mass-to-light ratios (because of the difficulty of 
measuring purely gravitational dynamics). There is evidence for large scale
gas-outflows in many objects. Finally, the Magellanic-type dwarf irregulars 
generally show a smooth underlying component of old stars, normal to high 
M/L ratios and metallicities around 0.1 solar. 

All of these phenomena are linked by evidence for episodic star-formation
and large-scale outflows of gas. There are structural and kinematic
similarities and it is attractive to view these as different evolutionary
states modulated by the accumulation, consumption and ejection of gas. 
As discussed below (Sect. 4.7), it is likely that the differences between these
dwarf galaxies and more normal galaxies arise from the drastic effects of 
feedback between star-formation and the interstellar medium. Supernovae 
in dark matter haloes with $\sigma < 100$ km s$^{-1}$ are able to inject sufficient
energy to expel the interstellar gas, at least temporarily, from the galaxy 
[15]. 

2.1.5 Active galactic nuclei 

Many galaxies exhibit activity in their nuclei indicative of the presence of 
super-massive black-holes. The relationship between the black holes and the 
formation and evolution of the surrounding galaxies has never been clear. In 
general terms, the gas-rich nature of young galaxies and likely importance 
of merging in assembling galaxies would both be conducive to the feeding of 
a central black-hole. The general relation between black-hole mass inferred 
from dynamical studies and the stellar mass of the spheroidal components
of the host galaxies [16] suggests that the formation and growth of both 
may very well be linked. 

2.1.6 Ultra-luminous galaxies 

In Section 2.1.4 above, we discussed how some small galaxies are observed to 
be in a state of highly elevated star-formation relative to their time-averaged
state. Such galaxies are called starbursts. 

It has been known since the IRAS mission, that the most luminous 
galaxies in the local Universe are in fact heavily obscured by dust and radi- 
ate all but a few percent of their luminosity in the far-infrared waveband, 
i.e. between 10 $\mu$m and 1 mm, with a spectral energy distribution peaking
around 100 $\mu$m (see [17] and references therein). The dust has a characteristic
temperature of 35-60 K. The most luminous of these, the so-called
ultra-luminous infrared galaxies (ULIRGs), attain luminosities in excess of
$10^{12}\,L_\odot$. Some of these at least are powered by bursts of star-formation,
others by AGN (see Sect. 5.1.4 below). The implied star-formation rates
are very high, approaching $1000\,M_\odot$ yr$^{-1}$.

2.2 The luminosity function and the luminosity density and extragalactic 
background light 

Astronomers detect “light” and two fundamental global quantities are the 
luminosity density of the present-day Universe as a function of wavelength 
and the Extragalactic Background Light (EBL) as a function of wavelength. 
The second quantity is the first integrated over distance incorporating the 
appropriate redshift effects. Recent estimates of these two quantities [18] 
are shown below (Figs. 2 and 3). 

Fig. 2. The local luminosity density of the Universe, $L_\nu$, from [18]. Note that this
is canted by one power of wavelength relative to Figure 3. The long wavelength
component comprises about 1/3 of the total and arises from dust warmed by
absorbing visible and ultraviolet radiation.




Fig. 3. The extragalactic background light, from [18] which should be consulted 
for a detailed explanation of symbols and models. The important point is that at 
least a half of the EBL emerges at long wavelength. The optical/near-IR and far- 
IR/sub-mm backgrounds dominate over those seen at radio or X-ray wavelengths 
but are themselves dwarfed by the primordial cosmic microwave background. 

Table 1: The baryon budget [22].

Component              Best estimate             Range

Observed at z = 0
Stars in spheroids     0.0026 $h_{70}^{-1}$      0.0043 - 0.0014 $h_{70}^{-1}$
Stars in disks         0.00086 $h_{70}^{-1}$     0.0013 - 0.00051 $h_{70}^{-1}$
Stars in Irregulars    0.00007 $h_{70}^{-1}$     0.00012 - 0.00003 $h_{70}^{-1}$
Neutral atomic gas     0.00033 $h_{70}^{-1}$     0.0004 - 0.00025 $h_{70}^{-1}$
Molecular gas          0.0003 $h_{70}^{-1}$      0.0004 - 0.0002 $h_{70}^{-1}$
Plasma in clusters     0.0026 $h_{70}^{-1.5}$    0.0044 - 0.0014 $h_{70}^{-1.5}$
Plasma in groups       0.014 $h_{70}^{-1}$       0.03 - 0.007 $h_{70}^{-1}$
Sum                    0.021                     0.04 - 0.007 $h_{70}^{-1}$

Gaseous at z = 3
Damped absorbers       0.0015 $h_{70}^{-1}$      0.003 - 0.0007 $h_{70}^{-1}$
Lyman $\alpha$ forest  0.04 $h_{70}^{-1.5}$      0.05 - 0.01 $h_{70}^{-1.5}$
Sum                    0.04                      0.05 - 0.01


An important point in the context of our exploration of the distant 
Universe (which is still heavily based on the optical and near-infrared wave- 
band) is that at least 50% of the energy in the EBL is emitted in the far-IR 
and sub-mm waveband [19-21]. 



2.3 The baryon budget 

Another basic datum is the current distribution of baryons in the Universe 
amongst different components. The overall baryon density of the Universe 
is well-constrained by nucleosynthesis to be far less than the closure density 
and far more than the baryons seen as stars and gas in galaxies. Fukugita
et al. [22] have examined the location of baryons in the present-day (and
high redshift) Universe and deduce the values in Table 1 (taken from [22]).

In the context of galaxy formation at high redshifts, two points are 
noteworthy from this table. First, it should be noted that galaxy formation 
must be an inherently inefficient process. Most of the baryons in the present- 
day Universe are not in galaxies at all but in the surrounding intergalactic 
medium. Certainly in the case of clusters, and likely in the case of poorer 
groups, this gaseous material is significantly enriched to about one third 
solar (see [23]) and has therefore been processed through stars. Second, it
should be noted that the majority of present-day stars are in the spheroidal
components of galaxies, precisely where we do not find significant star- 
formation today (in normal galaxies). 

3 The theoretical framework. I: Cosmology

3.1 The Robertson-Walker metric and the appearance of distant objects 

The Robertson-Walker metric that describes an expanding homogeneous
and isotropic Universe may be written:

$$ ds^2 = c^2 d\tau^2 - R^2(\tau)\left(d\omega^2 + S_k^2(\omega)\,d\gamma^2\right) \tag{3.1} $$

with $S_k(\omega) = A\sin(\omega/A)$ for $k = +1$, $S_k(\omega) = \omega$ for $k = 0$ and
$S_k(\omega) = A\sinh(\omega/A)$ for $k = -1$. The co-ordinate system is chosen so that the
set of “fundamental observers” who have constant spatial co-ordinates witness isotropy
and homogeneity. The time co-ordinate $\tau$ is then simply the proper time
for these special observers and thus a sensible “cosmic time”. The scale
factor $R(\tau)$ describes the expansion. The departures from Euclidean geometry
are represented by the $S_k(\omega)$ term with $A$ a comoving radius of curvature
(which is fixed). As an astronomical observer, I prefer this form of the RW
metric because the radial co-ordinate $\omega$ is intuitive: it explicitly
retains the units of the radius of curvature $A$ rather than subsuming
them into the radial coordinate.

The redshift, $z$, is defined observationally in terms of the
emitted and observed wavelengths or frequencies of light. It is easy to show
from the metric that the redshift of a distant source is given simply by the
values of $R(\tau)$ when the light is emitted and received and that it is simply
a time dilation effect:

$$ (1 + z) = \nu_e/\nu_o = R(\tau_o)/R(\tau_e). \tag{3.2} $$

Hubble’s parameter is given by: 

$$ H(\tau) = R^{-1} \times dR/d\tau. \tag{3.3} $$

From the metric, the bolometric brightness of a distant object is given by: 

$$ f_{\rm bol} = \frac{L_{\rm bol}}{4\pi R_0^2 S_k^2(\omega)(1+z)^2} \tag{3.4} $$



The two $(1 + z)$ factors both arise from time-dilation effects. This leads
to the idea of an “effective distance”, $D = R_0 S_k(\omega)$, and a “luminosity
distance”, $D_L = D(1 + z)$. As an aside, astronomers usually measure a flux
density (flux per unit bandwidth in frequency or wavelength space):

$$ f_\nu = \frac{L_\nu}{4\pi R_0^2 S_k^2(\omega)(1+z)} \tag{3.5} $$

$$ f_\lambda = \frac{L_\lambda}{4\pi R_0^2 S_k^2(\omega)(1+z)^3} \tag{3.6} $$

For the angular sizes of structures of a given comoving size, the metric gives:

$$ \theta = \frac{s_c}{S_k(\omega)} = \frac{s(1+z)}{D} = \frac{s}{D_\theta} \tag{3.7} $$



with $D_\theta = D(1 + z)^{-1}$. Note that $D_\theta$ generally decreases with redshift
at very high redshifts, $z > 1$, so objects of constant physical size actually
appear larger with increasing redshift beyond a certain point.

From the above relations, it is easy to show that the apparent surface
brightness of a source decreases as $(1 + z)^{-4}$ (in bolometric terms, adjusted
by $(1 + z)$ for monochromatic surface brightnesses).



3.2 $R(\tau)$ and the solutions to the Friedmann equation

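Derivation of the form of $R(\tau)$ requires solution of the Friedmann equation:

$$ \left(\frac{dR}{d\tau}\right)^2 = \frac{8\pi G}{3}\rho R^2 - \frac{kc^2}{A^2} \tag{3.8} $$

Solutions to the Friedmann equation depend on the values of $\rho(R)$ and $k/A^2$
with $\Omega$ playing a key role:

$$ \Omega = \frac{8\pi G\rho}{3H^2} \tag{3.9} $$

The relationship between curvature and $\Omega$ and $\Lambda$ is given by:

$$ \frac{kc^2}{A^2}\,\frac{1}{R^2H^2} = \Omega + \Lambda - 1 \tag{3.10} $$

with $\Lambda$ the vacuum-energy contribution to the density in units of the critical density.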



With $R(\tau)$ and $k/A^2$ known, the quantity $S_k(\omega)$ (or rather $S_k(z)$, since
the redshift $z$ is the only real observable related to distance) can be
computed since $d\omega/dz = d\omega/d\tau \times d\tau/dR \times dR/dz$. For a general solution for
a matter dominated Universe, we can find $S_k(\omega)$, and the “effective distance”
introduced above, $D = R_0 S_k$, can then be written in terms of what we could
regard as an “effective redshift” $Z_q(z)$ given by the Mattig formula:

$$ D = R_0 S_k(\omega) = \frac{c}{H_0}\,\frac{q_0 z + (q_0 - 1)\left((1 + 2q_0 z)^{0.5} - 1\right)}{q_0^2(1+z)} \tag{3.11} $$



The quantities $D(z)$ and $d\omega/dz$ are sufficient to derive most quantities of
interest in interpreting observations at high redshift. As an example, the
incremental comoving volume element with redshift, $dV/dz$, is given, per
unit solid angle, by:

$$ \frac{dV}{dz} = \frac{d\omega}{dz} \times D^2 \tag{3.12} $$

$Z_q(z)$ and $d\omega/dz$ are shown in Figure 4 for three cosmologies: a flat $\Omega = 1$
matter dominated cosmology; a flat $\Omega = 0.25$, $\Lambda = 0.75$ cosmology; and an
open $\Omega = 0.25$, $\Lambda = 0$ cosmology.
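
As a rough numerical illustration of the Mattig formula, the sketch below evaluates the dimensionless factor that multiplies $c/H_0$ in the effective distance, which is taken here to be the "effective redshift" $Z_q(z)$ (an interpretation of the text, not a quoted definition). The function names are hypothetical, and the formula is assumed to apply only to matter-dominated models with $\Lambda = 0$ and $q_0 > 0$.

```python
import math


def mattig_effective_redshift(z, q0):
    """Dimensionless D * H0 / c from the Mattig formula (Lambda = 0, q0 > 0)."""
    numerator = q0 * z + (q0 - 1.0) * (math.sqrt(1.0 + 2.0 * q0 * z) - 1.0)
    return numerator / (q0 ** 2 * (1.0 + z))


def angular_diameter_factor(z, q0):
    """Dimensionless D_theta * H0 / c, using D_theta = D / (1 + z)."""
    return mattig_effective_redshift(z, q0) / (1.0 + z)


if __name__ == "__main__":
    # q0 = 0.5 corresponds to the flat, matter-dominated Omega = 1 case.
    for z in (0.01, 0.5, 1.0, 1.25, 3.0):
        print(z, mattig_effective_redshift(z, 0.5), angular_diameter_factor(z, 0.5))
```

For $q_0 = 0.5$ the expression reduces to $2\,(1 - (1+z)^{-1/2})$, and the angular-size distance factor turns over near $z = 1.25$, which is the behaviour described for $D_\theta$ in Section 3.1.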

Fig. 4. The functions $Z_q(z)$, monotonically increasing with redshift, and $d\omega/dz$
(in units of $c/H_0$), monotonically decreasing with redshift. The solid curves are
the emerging standard of $\Omega_m = 0.25$, $\Lambda = 0.75$, the dotted curve is a matter-
dominated $\Omega = 1$ flat Universe and the dashed curve is an open $\Omega_m = 0.25$, $\Lambda = 0$
model.



A well known result is that, in terms of the physical curvature and of
$R(\tau)$, non-critical Universes revert asymptotically to the critical flat case
at high redshift (assuming a constant $\Lambda$). This has consequences both for
the formation of structure (see below) and for the appearance of objects at
high redshift. The transitional redshift beyond which the Universe appears
critical and flat is given by $(1 + z) \sim \Omega^{-1}$ for a matter dominated Universe
and $(1 + z) \sim \Omega^{-1/3}$ for the $\Lambda$-dominated Universe, so the transition occurs
at much lower redshift in the $\Lambda$-dominated case.

It will be clear from the foregoing that the distant objects that must
be studied to directly observe the evolution of the Universe will be faint
(as $D_L$), small (as $D_\theta$), have lower surface brightness (as $(1 + z)^{-4}$) and be
shifted to longer wavelengths (as $(1 + z)$).

3.3 Cosmological parameters and uncertainties 

For many years, our uncertainty regarding the cosmological parameters has 
been a major uncertainty in interpreting our observations of distant galaxies. 
Whereas $H_0$ enters as a scaling factor independent of redshift, the choice of
$\Omega$ and $\Lambda$ affects relative properties at low and high redshift (see Fig. 4).

A review of the current constraints on the values of the cosmological 
parameters is beyond the scope of this review. However, I think the evidence 
that the cosmological geometry is flat [24] and that the matter density is 
substantially below the critical value (see e.g. [25] and references therein) 
is increasingly compelling. This implies a substantial $\Lambda$ term, which is also
supported by the Hubble diagram for Supernovae Ia [26,27].

Unfortunately, most of the results for objects at high redshift have been
calculated (or at least published) for the standard matter-dominated $\Omega = 1$
cosmology. As shown in Figure 4, this can introduce a substantial error at
high redshifts. A systematic recalculation of many standard results in the
new cosmology, i.e. $\Omega = 0.25$, $\Lambda = 0.75$, is overdue.

3.4 The development of density fluctuations 
3.4.1 Linear growth 

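We can regard the density perturbation field $\Delta(\mathbf{x}) = \delta\rho/\rho$ as a Fourier
integral in comoving wave-number $k$,

$$ \Delta(\mathbf{x}) \propto \int \Delta_k\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, d^3k \tag{3.13} $$

with a power spectrum $P(k)$,

$$ P(k) = \langle|\Delta_k|^2\rangle = \Delta_k^2. \tag{3.14} $$

The initial (primordial) power spectrum $P(k)$ is usually taken to be a power-
law in comoving wave number $k$:

$$ \Delta_k^2 \propto k^n. \tag{3.15} $$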

An important quantity is the variance of the mass fluctuations on some
mass scale. The quantity $k^3\Delta_k^2$ represents the contribution to the
variance per logarithmic interval of scale (or mass). For density fluctuations a
given number, $\nu$, of standard deviations away from zero (e.g. for all, say,
$3\sigma$ fluctuations, $\nu = 3$), it tells us the square of the amplitude of density
fluctuations on different mass-scales and thus determines the sequence of
structure formation. It is a dimensionless quantity that depends on the
power-law index $n$:

$$ [k^3|\Delta_k|^2] \propto k^{3+n} \propto M^{-(3+n)/3} \tag{3.16} $$

The growth of a particular Fourier component through gravitational
instability in an expanding medium is governed by the Jeans-Lifshitz
equation:

$$ \frac{d^2\Delta_k}{d\tau^2} + 2\,\frac{dR/d\tau}{R}\,\frac{d\Delta_k}{d\tau} = (4\pi G\rho_0 - k^2c_s^2)\,\Delta_k. \tag{3.17} $$



The solutions depend on the form of $R(\tau)$ and for a critical $\Omega = 1$ matter-
dominated Universe,

$$ \Delta = \Delta_0(\tau/\tau_0)^{2/3} = \Delta_0(R/R_0). \tag{3.18} $$

In the empty $\Omega = 0$ Universe, or a pure-$\Lambda$ accelerating Universe, there is
no growing mode, and the fluctuations will at best be frozen in at constant
amplitude.
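
The growing-mode solution for the critical case can be checked numerically. The sketch below integrates the growth equation with the pressure term neglected (an assumption appropriate to scales well above the Jeans length), using $R \propto \tau^{2/3}$ so that $(dR/d\tau)/R = 2/(3\tau)$ and $4\pi G\rho_0 = 2/(3\tau^2)$. The function names and the simple fourth-order Runge-Kutta stepper are illustrative choices, not part of the original text.

```python
def growth_rhs(tau, delta, ddelta):
    """Right-hand side of the pressure-free growth equation for Omega = 1.

    Returns (d delta / d tau, d^2 delta / d tau^2) in units where the
    time variable tau is dimensionless.
    """
    accel = -(4.0 / (3.0 * tau)) * ddelta + (2.0 / (3.0 * tau ** 2)) * delta
    return ddelta, accel


def integrate_growth(tau_start, tau_end, delta0, ddelta0, steps=10000):
    """Integrate the growth equation with a classical RK4 scheme."""
    h = (tau_end - tau_start) / steps
    tau, d, v = tau_start, delta0, ddelta0
    for _ in range(steps):
        k1d, k1v = growth_rhs(tau, d, v)
        k2d, k2v = growth_rhs(tau + h / 2, d + h / 2 * k1d, v + h / 2 * k1v)
        k3d, k3v = growth_rhs(tau + h / 2, d + h / 2 * k2d, v + h / 2 * k2v)
        k4d, k4v = growth_rhs(tau + h, d + h * k3d, v + h * k3v)
        d += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        tau += h
    return d


if __name__ == "__main__":
    # Start on the pure growing mode: delta = tau^(2/3) at tau = 1.
    print(integrate_growth(1.0, 8.0, 1.0, 2.0 / 3.0))
```

Starting on the growing mode at $\tau = 1$, the amplitude at $\tau = 8$ should be close to $8^{2/3} = 4$, i.e. it tracks the scale factor.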

3.4.2 Fluctuations in baryonic matter and radiation 

For scales less than the Jeans length, $\lambda < \lambda_J$, the equation has oscilla-
tory solutions. It is easy to show, for a fluid composed of matter and
radiation, that the Jeans length is equal to the horizon scale while the
Universe is radiation dominated, that it then stays constant in co-moving 
scale until recombination, and then drops precipitously as the coupling be- 
tween matter and radiation is lost. Dark matter has effectively zero Jeans 
length because it is non-interactive. There are damping processes (Silk 
damping [28]) which can operate on fluctuations in the matter and radia- 
tion when they are in this oscillatory phase, due to the diffusion of photons 
out of the density enhancement. But the details need not concern us here 
because fluctuations in the non-interacting dark matter component are not 
affected. 

3.4.3 Modification of the primordial spectrum 

Of more concern are processes that affect the dark matter fluctuations.
Free-streaming damping can occur for collision-less particles for as long 
as they are still relativistic. At a given epoch, relativistic particles will 
have had time to “free-stream” out of any density perturbations that are 
smaller than the horizon scale. After the dark-matter species becomes non- 
relativistic, free-streaming damping becomes ineffective (since the velocity 
of the particles drops rapidly). Thus, free-streaming damping modifies the
initial spectrum by effectively eliminating all density perturbations on scales
smaller than the horizon scale at the epoch when the particles became non-
relativistic. The possible importance of free-streaming damping is what
differentiates the different brands of non-interacting dark matter. Hot, Warm
and Cold Dark Matter (CDM) correspond to situations in which the damping
scale is larger than, comparable to or smaller than galactic masses respectively.
CDM, with $M_{\rm fs}$ much less than galactic masses (the smallest mass scale of
interest), effectively has no free-streaming damping.

Finally, “stagspansion” is an effect that occurs within the horizon
(effectively within the radiation Jeans length) when we have $\rho_{\rm rad} > \rho_{\rm CDM}$.
While the density fluctuation is larger than the horizon, the radiation and
dark matter perturbation grows in the usual way. But, when the
perturbation enters the horizon, the radiation is prevented from further collapse
by pressure effects because it is below the Jeans mass (see above). The
dark matter component in the density fluctuation will in principle be free
to grow since its Jeans mass is negligible (since the dark matter particles
don’t interact) but the growth will be small for as long as the Universe is
still radiation dominated (i.e. $R < R_{\rm eq}$) because the collapse time for the
DM perturbation is much longer than the expansion time for the Universe.
Only when the dark matter comes to dominate the density of the Universe
will the normal growth solutions considered above take over. The effect of
stagspansion is to introduce a change of slope in the perturbation spectrum
at around the mass scale corresponding to the horizon mass at $R_{\rm eq}$. In
particular for the $n = 1$ Zeldovich spectrum, in which the amplitude of the
perturbations entering the horizon is independent of epoch, the effect of
stagspansion is to flatten the spectrum of perturbations for $M < M_H(R_{\rm eq})$
since the limiting effect of stagspansion is to allow no growth after the
fluctuation enters the horizon. However, this is a slow roll-over and produces
a very characteristic curvature of the CDM spectrum over a wide range of
masses (see [29]). If we now look, at some late epoch $R > R_{\rm eq}$, at a
perturbation spectrum that was initially a Zeldovich spectrum with $n_0 \sim 1$, we
will find for the parameter

$$ \Delta_M \equiv [k^3|\Delta_k|^2]^{1/2} \tag{3.19} $$

i.e.

$$ \Delta_M \propto M^{-2/3} \quad \text{for large } M \tag{3.20} $$

$$ \Delta_M \sim \text{constant} \quad \text{for small } M. \tag{3.21} $$

The function $\Delta_M$ is central to an understanding of the sequence of
structure formation in the Universe. Recall that it is the square-root of the
incremental variance in the density field when the Universe is smoothed on
a comoving scale corresponding to the mass $M$.

The CDM scenario, in which stagspansion modifies the Zeldovich spec-
trum, is a “bottom-up model” since $\Delta_M$ decreases monotonically with mass.
However, the flattening of $\Delta_M$ on small scales means that, on masses com-
parable to galaxies, structures on a variety of mass scales are collapsing at
the same time leading to a picture for galaxy formation in which the collapse 
of sub-units and the merging of these sub-units into larger virialized struc- 
tures is occurring essentially simultaneously. This leads to the expectation 
that galaxy formation will likely be a “process” rather than an “event” , and 
that astrophysical effects affecting the baryons, such as cooling, are likely 
to be important during the hierarchical assembly of the dark matter haloes. 

4 The theoretical framework. II: The non-linear regime 

4.1 Non-linear collapse 

As $\delta\rho/\rho$ approaches unity, the growth of fluctuations becomes non-linear
and subsequent increase in $\delta\rho/\rho$ is rapid. In this section we will look at
this non-linear collapse and explore some aspects of this that are relevant 
to understanding galaxy formation. By considering a density excess with a 
top-hat density profile, it is easy to show that at turn-around the density is 
about 5.5 times the mean density of the Universe. Following turn-around, 
a spherically symmetric density fluctuation would in principle collapse to a 
point mass in the same time it has taken to turn around. However, since the 
collapsing object is destined to become virialized through violent relaxation 
or some other relaxation process, the density will become stabilized when 
$r_{\rm vir} = 0.5\,r_{\rm max}$. Thus, virialization supports a collapsing object when it has
collapsed by a factor of about 2. Consequently, the final density of the 
collapsed object is about 40 times the density of the Universe at the time 
that the fluctuation turned around, or about 160 times the density of the 
Universe at virialization. Locations in the Universe in which the density 
exceeds about 200 times the average density can thus be assumed to be 
virialized. 
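
The chain of density contrasts quoted above can be reproduced with a few lines of arithmetic. The sketch assumes the standard spherical top-hat results for a critical $\Omega = 1$ background, which are not spelled out in the text: a turn-around overdensity of $(3\pi/4)^2$, collapse completing at twice the turn-around time, and a mean background density falling as $\tau^{-2}$. The function name is hypothetical.

```python
import math


def tophat_density_contrasts():
    """Density of a top-hat perturbation relative to the mean density.

    Returns three ratios (all assumptions hold for Omega = 1 only):
      1. at turn-around, relative to the mean density at turn-around;
      2. after virialization, relative to the mean density at turn-around;
      3. after virialization, relative to the mean density at collapse.
    """
    at_turnaround = (3.0 * math.pi / 4.0) ** 2
    # Halving the radius raises the density by a factor 2^3 = 8.
    virial_vs_turnaround_mean = 8.0 * at_turnaround
    # The background density drops by 2^2 = 4 between turn-around and collapse.
    virial_vs_collapse_mean = 4.0 * virial_vs_turnaround_mean
    return at_turnaround, virial_vs_turnaround_mean, virial_vs_collapse_mean


if __name__ == "__main__":
    print(tophat_density_contrasts())
```

Under these assumptions the three ratios come out near 5.55, 44 and 178 (the last being $18\pi^2$), consistent with the rounded values of about 5.5, 40 and 160 to 200 used in the text.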

Unless energy is lost, the size of the collapsed and virialized object will 
then remain the same for all subsequent time. The density is so much higher 
than the average density of the Universe that the subsequent expansion of 
the Universe as a whole does not affect the internal dynamics of the object. 
However, dissipational processes in the baryonic component may cause the
loss of energy and thus lead to further contraction within a dark halo of constant
size.

4.2 Hierarchical clustering and dissipation models 

Dissipation involves the radiative loss of “thermal” energy of a system. 
Dissipation is therefore a mechanism whereby an otherwise stable virialized 
system can undergo further collapse by losing kinetic energy via thermal
energy, which must be replaced by potential energy, in order to keep satisfying
the virial condition, resulting in further gradual collapse. Non-interacting 
dark matter cannot dissipate since it cannot radiate electromagnetically. So 
dissipation is an attractive way in which to separate luminous matter from 
the extended dark matter haloes of galaxies, i.e. to produce the observed 
baryon dominance in the centres of galaxies. Indeed, one definition of a 
“galaxy” could be an object in which dissipation has separated the baryons 
from the dark matter. 

In general, cooling mechanisms will be two-body processes, proportional
to the square of the number density $n$:

$$ \frac{dE}{dt} = -n^2\Lambda(T) \tag{4.1} $$

so

$$ t_{\rm cool} = \frac{E}{|dE/dt|} = \frac{3kT}{2n\Lambda(T)}. \tag{4.2} $$




At high temperatures, $\Lambda(T)$ is dominated by bremsstrahlung emission. At
temperatures $10^4 < T < 10^6$ K it is dominated by free-bound
transitions in H and He. Finally $\Lambda(T)$ drops sharply at $10^4$ K, when $kT$
drops below the ionization potential of Hydrogen. We can therefore plot, as
a function of density and temperature, a locus where $t_{\rm cool} = t_{\rm collapse}$. The
significance of the collapse timescale is that this is the timescale on which the
object will be taken up into a larger structure. This is known as the Rees-
Ostriker cooling diagram [30]. Dissipation must be important for systems
with $t_{\rm cool} < t_{\rm collapse}$. For density fluctuations with the same number, $\nu$, of
standard deviations from zero, the initial temperature and density of the gas
in a collapsed object are set by the virialized velocity dispersion and the turn-
around epoch respectively [31, 32], i.e. by the mass, while the conditions
within objects of the same mass will be set by $\nu$.
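
To give a feel for the numbers in the cooling-time expression $t_{\rm cool} = 3kT/(2n\Lambda)$, the sketch below evaluates it in CGS units. The value of the cooling function used in the example is purely illustrative (an assumed order of magnitude, not a value taken from the text or from a cooling curve), and the function name is hypothetical.

```python
K_BOLTZMANN_ERG_PER_K = 1.380649e-16  # Boltzmann constant in erg / K


def cooling_time_seconds(temperature_k, number_density_cm3, cooling_function):
    """Cooling time t_cool = 3 k T / (2 n Lambda) in seconds.

    cooling_function is Lambda(T) in erg cm^3 / s and must be supplied by
    the caller; no cooling curve is built in.
    """
    thermal = 3.0 * K_BOLTZMANN_ERG_PER_K * temperature_k
    return thermal / (2.0 * number_density_cm3 * cooling_function)


if __name__ == "__main__":
    # Illustrative inputs only: T = 1e6 K, n = 1e-3 cm^-3, Lambda = 1e-23 erg cm^3/s.
    t_cool = cooling_time_seconds(1.0e6, 1.0e-3, 1.0e-23)
    print(t_cool, "s =", t_cool / 3.156e16, "Gyr")
```

Because the rate is a two-body process, doubling the density at fixed temperature halves the cooling time, which is what moves an object across the cooling locus in the diagram.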

Galaxies now are made up largely of non-dissipating stars, so they will 
not currently be dissipating and collapsing. However, if they were once 
completely gaseous then they must have been strongly dissipative and will 
have arrived at their current location in this diagram through the effects of 
dissipation as well as dynamical relaxation processes. On the other hand, 
groups and clusters of galaxies lie outside of this area and dissipation has 
probably never been important for these systems even when and if they were 
completely gaseous. This is why they are structures composed of discrete 
sub-units (galaxies). At the other mass extreme, small gas clouds are never 
able to cool and thus do not form recognizable galaxies. 

The motion of a dissipating object in the $n$-$T$ plane will depend on
whether the potential changes as the dissipation occurs. Dissipating gas
in a non-dissipative dark matter halo will move vertically, changing $n$ at
constant $T$, whereas a self-gravitating dissipating gas will move diagonally.
The present-day location of galaxies on this diagram suggests that the initial
value of the spectral index $n$ going into the galaxy formation process is $n < -2$. This is
consistent with CDM. As an aside, one can look at the value of $n$ in the
context of scaling relations for dark matter haloes. For $|\Delta_k|^2 \propto k^n$ we
get $M \propto v^{12/(1-n)}$. Thus, if the ratio of light emission to dark matter is
constant and if the circular velocity of the disk is the circular velocity of
the halo, and all galaxies of a particular Hubble type are associated with a
given relative amplitude of fluctuation (i.e. a given number $\nu$ of $\sigma$), then
we will recover the Tully-Fisher relation $L \propto v^3$ if $n \sim -3$. Furthermore, as
discussed by Navarro and Steinmetz [9], deviations in these quantities are
likely to be correlated in such a way as to not lead to an increase in the
scatter.

Fig. 5. The Rees-Ostriker cooling diagram from [34], itself adapted from [32].
Gaseous objects above the cooling locus (computed for different chemical abun-
dances) will be strongly affected by dissipation. This is where the stellar com-
ponents of galaxies are found today. Clusters (represented by solid points) lie
outside of this area.


What can halt dissipation? There are two obvious possibilities: first,
the baryonic material could become dissipationless by turning into stars. 
This may possibly be what happened in spheroids. Or, angular momentum 
could halt the collapse and form a uniformly rotating disk with small random 
velocities. This may be appropriate for the disks of spiral galaxies. 



4.3 The Press-Schechter formalism 

All of the foregoing has considered density perturbations in isolation. Press
and Schechter [33] developed a formalism for computing the number density
of virialised objects within a hierarchy of structures. Peacock [34] gives a
good discussion of this. The location and properties of bound objects can
be estimated from filtering of the initial density field. If the density field is
Gaussian, then the probability that a given point lies in a region with $\delta > \delta_c$
when the density field is smoothed on some scale $R_f$ is

$$ p(\delta > \delta_c \,|\, R_f) = \frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{\delta_c}{\sqrt{2}\,\sigma(R_f)}\right)\right] \tag{4.3} $$

The fraction of material that has collapsed into objects with mass greater
than $M$ can then be written, in terms of $\nu = \delta_c/\sigma(M)$, as
$F(>M) = 1 - \mathrm{erf}(\nu/\sqrt{2})$, where the factor of 1/2 has been dropped to account for the
regions with below average density. The mass function $f(M)$ is given by
$Mf(M)/\rho_0 = |dF/dM|$, so:

$$ \frac{M^2 f(M)}{\rho_0} = \left|\frac{dF}{d\ln M}\right| = \left|\frac{d\ln\sigma}{d\ln M}\right| \sqrt{\frac{2}{\pi}}\,\nu\,\exp\left(-\frac{\nu^2}{2}\right) \tag{4.4} $$

The mass scales and spectral index $n$ enter this equation through $\nu$, which
we have seen above goes as $\nu = (M/M^*)^{(3+n)/6}$. Note that the form of
$f(M)$ is like a Schechter function, with an exponential cut-off at high $\nu$
(high $M$) and a power-law at $\nu < 1$. However, at low $M$, $f(M) \propto M^{(n-9)/6}$,
which for $n \sim -2$ is rather steeper than observed in the galaxy population,
but this could well be due to the removal of gas from low mass haloes:
dwarf spheroidal galaxies appear to have very high mass-to-light ratios. It
should also be noted that the luminosity function for the bluest galaxies is
quite steep (Fig. 1). The $M^*$ exponential cut-off is also considerably more
massive than the masses of $L^*$ galaxies in the Schechter function. However,
this is because objects more massive than $L^*$ galaxies tend to form groups
and clusters of galaxies, i.e. the $L^*$ in the galaxian luminosity function
reflects the effects of dissipation (see above) rather than the $M^*$ of the dark
matter haloes.



M* will increase with time as the amplitude of typical density fluctua- 
tions on larger mass scales increases towards non-linearity, leading to rapid 
changes in the number density of collapsed objects with M > M*, such as 
rich clusters of galaxies (see [35]). 
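
The shape of the Press-Schechter mass function for a pure power-law spectrum can be sketched numerically. The code below assumes $\nu = (M/M^*)^{(3+n)/6}$, so that $|d\ln\sigma/d\ln M| = (3+n)/6$, and evaluates the quantity $M^2 f(M)/\rho_0$; the function names, integration limits and step count are illustrative choices rather than anything specified in the text.

```python
import math


def ps_multiplicity(mass_ratio, n):
    """M^2 f(M) / rho_0 for a power-law spectrum; mass_ratio is M / M*."""
    gamma = (3.0 + n) / 6.0  # |d ln sigma / d ln M| for sigma ~ M^-(3+n)/6
    nu = mass_ratio ** gamma
    return gamma * math.sqrt(2.0 / math.pi) * nu * math.exp(-0.5 * nu * nu)


def collapsed_fraction(n, ln_min=-60.0, ln_max=20.0, steps=20000):
    """Trapezoidal integral of the multiplicity function over ln(M / M*)."""
    h = (ln_max - ln_min) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * ps_multiplicity(math.exp(ln_min + i * h), n)
    return total * h


if __name__ == "__main__":
    for ratio in (0.01, 0.1, 1.0, 10.0):
        print(ratio, ps_multiplicity(ratio, -2.0))
    print("total collapsed fraction:", collapsed_fraction(-2.0))
```

Integrating over all masses should return a value close to unity, which is a useful check that the dropped factor of 1/2 has been handled consistently: all of the mass ends up in collapsed objects of some mass.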

4.4 Biassed galaxy formation

An important question is the degree to which the distribution of galaxies 
produced by the galaxy formation process does or does not trace the un- 
derlying distribution of dark matter. A bias parameter, b, is introduced 
[36]: 



$$ (\delta\rho/\rho)_{\rm galaxies} = b\,(\delta\rho/\rho)_{\rm mass}. \tag{4.5} $$

Bias could arise through a variety of mechanisms. One possibility would 
be a spatially-varying efficiency of galaxy formation. This could happen if,
say, the baryonic material was affected on large scales by particular events 
in the neighbourhood although the energy requirements are very large. 
A more straightforward effect arises once it is appreciated that the den- 
sity in small density fluctuations will be affected by Fourier components 
on much larger scales, and thus that objects of a given mass will tend to 
collapse sooner when in regions of large-scale over-density [37]. Considering 
individual objects with density amplitude $\nu\sigma$, the bias will be [38]:

$$ b(\nu) = 1 + \frac{\nu^2 - 1}{\nu\sigma} \tag{4.6} $$

Thus high $\nu$ objects, which will be the first to form on a given mass scale and
which will be rare members of the population of objects of mass $M$, would be
expected to be highly biased, and thus to be much more strongly clustered,
than lower $\nu$ objects. Biassing will be most important on scales comparable
to or larger than the $M^*$ mass in the Press-Schechter distribution because
on these scales the existing objects must have had high $\nu$ in order to have
formed at all.
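
A short numerical sketch makes the dependence of the bias on $\nu$ concrete. It evaluates $b = 1 + (\nu^2 - 1)/(\nu\sigma)$ directly; in the example, $\sigma$ is set by assuming that collapsing objects satisfy $\nu\sigma = \delta_c$ with $\delta_c = 1.686$, a commonly used collapse threshold that is an assumption here and is not quoted in the text. The function name is hypothetical.

```python
ASSUMED_DELTA_C = 1.686  # assumed linear collapse threshold, not from the text


def peak_bias(nu, sigma):
    """Bias of objects with density amplitude nu * sigma."""
    return 1.0 + (nu * nu - 1.0) / (nu * sigma)


if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0, 3.0):
        sigma = ASSUMED_DELTA_C / nu  # so that nu * sigma = delta_c
        print(nu, peak_bias(nu, sigma))
```

Objects with $\nu = 1$ are unbiased, rarer objects with $\nu > 1$ are increasingly strongly clustered, and objects with $\nu < 1$ come out less clustered than the mass.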

4.5 Origin of angular momentum 

A dimensionless angular momentum parameter, $\lambda$, is conveniently defined
in terms of the total angular momentum, kinetic energy and mass of the
system:

$$ \lambda = \frac{J|E|^{1/2}}{GM^{5/2}} \tag{4.7} $$

Unless angular momentum is removed from the system, the initial value of $\lambda$
determines the collapse that can occur before rotational support is achieved.
It turns out that $r_{\rm final} = 2^{1/2}\lambda\,r_{\rm initial}$.

A density fluctuation is believed to acquire non-zero $\lambda$ through tidal
torquing between nearby non-spherical perturbations. Analysis of these
tidal torques suggests that typical values of $\lambda \sim 0.07$ can be achieved. This
value of $\lambda$ would produce collapse factors of around ten, roughly
right for typical spiral galaxies.
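
The collapse factor quoted above follows from simple arithmetic, sketched below under the assumption that the relation between final and initial radius is $r_{\rm final} = 2^{1/2}\lambda\,r_{\rm initial}$. The function name is hypothetical.

```python
import math


def collapse_factor(spin_parameter):
    """Ratio r_initial / r_final, assuming r_final = sqrt(2) * lambda * r_initial."""
    return 1.0 / (math.sqrt(2.0) * spin_parameter)


if __name__ == "__main__":
    print(collapse_factor(0.07))
```

For $\lambda = 0.07$ this gives a collapse factor of about ten.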

The peak angular momentum transfer between fluctuations is expected
to occur around the time of maximum size (i.e. at turn-around). One
could imagine that, on a given mass scale, $\lambda$ is dependent on the over-
density of a particular perturbation since this will determine the epoch
of collapse. More overdense regions (i.e. higher-$\nu$ positive fluctuations)
collapse faster and therefore have less time for the tidal torquing phase.
The value of $\lambda$ imparted in this way could clearly vary from galaxy to
galaxy and this could conceivably be the origin of the Hubble sequence. A
Hubble-type dependence on the $\nu$ of each fluctuation could also explain the
fact that ellipticals are generally found in richer cluster environments, since
long wavelength (i.e. resulting eventually in very high mass structures)
fluctuations will boost more short-wavelength fluctuations into the high-$\nu$
regime.



4.6 The structure of dark matter haloes 



The detailed structure of dark matter haloes produced by collapsing den-
sity fluctuations in N-body simulations has been studied by Navarro et al.
[39], who find that a remarkably homogeneous population of haloes is produced
spanning a wide range of mass and collapse time, with density profile

$$ \frac{\rho(r)}{\rho_{\rm crit}} = \frac{\delta_c}{(r/r_s)(1 + r/r_s)^2} \tag{4.8} $$

with

$$ r_s = \frac{r_{200}}{c} \tag{4.9} $$

$$ \delta_c = \frac{200}{3}\,\frac{c^3}{\ln(1+c) - c/(1+c)} \tag{4.10} $$

with $c$ a dimensionless parameter denoting the concentration of the halo
relative to the fiducial radius $r_{200}$ within which the density is 200 times the
critical value, which, as discussed above, represents that radius that should
be just virializing now. The concentration $c$ effectively gives information on
the epoch at which the central parts of the halo first collapsed.
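
The normalisation of this profile can be checked numerically. The sketch below evaluates the characteristic overdensity $\delta_c$ for a given concentration and integrates the profile to confirm that the mean density inside $r_{200}$ is 200 times the critical value. Function names, the midpoint integration and the example concentration $c = 10$ are illustrative assumptions.

```python
import math


def nfw_delta_c(c):
    """Characteristic overdensity for concentration c."""
    return (200.0 / 3.0) * c ** 3 / (math.log(1.0 + c) - c / (1.0 + c))


def nfw_density_ratio(r_over_r200, c):
    """rho(r) / rho_crit at radius r, with r_s = r_200 / c."""
    x = r_over_r200 * c  # r / r_s
    return nfw_delta_c(c) / (x * (1.0 + x) ** 2)


def mean_enclosed_density_ratio(c, steps=20000):
    """Mean density inside r_200 in units of rho_crit (midpoint rule).

    Computes 3 * integral_0^1 rho(u) u^2 du with u = r / r_200.
    """
    du = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += nfw_density_ratio(u, c) * u * u * du
    return 3.0 * total


if __name__ == "__main__":
    print(nfw_delta_c(10.0), mean_enclosed_density_ratio(10.0))
```

The enclosed mean density comes out at 200 for any concentration, which is what fixes the form of $\delta_c$ once $r_{200}$ is adopted as the fiducial radius.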



4.7 Feed-back processes 

The injection of energy by supernovae into the interstellar medium within a 
galaxy can regulate star-formation and even expel gas from the galaxy, if the 
energy injected exceeds the gravitational binding energy, $E_{\rm SN} > \frac{1}{2}M_g V^2$.
This may halt star-formation until gas is able to accumulate again. This 
was explored in some detail by Dekel and Silk [15]. 

Mass loss modifies the scaling relations between $L$, $R$, $v$ and the metal-
licity, $Z$. Self-consistency is achieved for a fluctuation spectrum of $n \sim -3$
appropriate for CDM-based models on small scales. Interestingly, the ability
of supernovae to eject gas from a halo potential well depends primarily on
the escape speed and not on the density of gas in the galaxy. This is because
the rate of SN per unit mass will be proportional to the inverse of the
collapse time, which goes as $\rho^{1/2}$, whereas the energy injected per SN depends
on the radiative cooling time, which goes as $\rho^{-1/2}$, so the specific energy
injection rate is independent of density. The critical velocity dispersion
below which we would expect gas ejection is about 100 km s$^{-1}$. Reassuringly,
this is very close to the observed demarcation between dwarf galaxies and
normal massive galaxies, suggesting that the differences in their properties
are indeed due to the varying importance of supernova driven mass-loss (see
Sect. 2.1.4 above).

While this explains nicely the difference between dwarf galaxies and 
more massive systems, the high metallicity of the intergalactic medium in 
rich clusters of galaxies (and quite possibly in the field) suggests that most 
metals are in fact outside of galaxies and therefore that ejection of enriched 
material was likely a pervasive feature of young galaxies. 



4.8 Chemical evolution 

Heavy elements (i.e. those heavier than He) are almost entirely the product
of stellar processes occurring well after the Big Bang and the abundance of
these elements (loosely referred to as “metals” by astronomers) relative to
Hydrogen offers a basic indication of the degree to which material has been
processed through stars (see [40] and references therein). Consider a system
in which the mass of stars is $M_s$, the mass of gas is $M_g$ and the mass of
heavy elements in that gas is $M_h$. The system is assumed to be “closed” so
that the total mass is constant. The metallicity is defined to be:

$$ Z = \frac{M_h}{M_g}. \tag{4.11} $$

In any generation of stars, some of these stars ($M > 9\,M_\odot$) will explode as
supernovae after very short lifetimes (enabling us to use the “instantaneous
recycling approximation”) but lower mass stars will lock up an amount
$\delta M_s$ of material in long-lived stars or stellar remnants. The mass of heavy
elements produced by these stars and returned to the interstellar medium
is defined in terms of a “yield”, $p$, to be $p\,\delta M_s$. Usually $p$ is assumed
independent of $Z$, though it may well not be. Assuming the system is
closed, then it is easy to show that

$$ \delta Z = -p\,\frac{\delta M_g}{M_g} \tag{4.12} $$
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and thus 



(^) ^ 

The metallicity of a system should thus depend only on the yield p and 
the fraction of gas consumed into long-lived stars. Note that the average 
metallicity of the stars after all the gas is consumed is in fact the yield: 
{Z) = p but that the metallicity of the last remaining gas is very high 
(although the cessation of star-formation as the gas is depleted will likely 
terminate this enrichment process before completion. 
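
The behaviour of the closed-box relation can be checked numerically. The following is a minimal sketch, assuming the system starts as metal-free gas and writing the gas fraction as mu = M_g/(M_s + M_g); the function names are illustrative only, and the expression for the mean stellar metallicity is obtained here from conservation of metals rather than taken from the text.

```python
import math


def gas_metallicity(gas_fraction, yield_p):
    """Closed-box gas metallicity, Z = p ln(1/mu), with mu = M_g / (M_s + M_g)."""
    if not 0.0 < gas_fraction <= 1.0:
        raise ValueError("gas_fraction must lie in (0, 1]")
    return yield_p * math.log(1.0 / gas_fraction)


def mean_stellar_metallicity(gas_fraction, yield_p):
    """Mass-weighted mean metallicity of all stars formed so far.

    Uses conservation of metals in a closed box:
    p * M_s = <Z_stars> * M_s + Z_gas * M_g.
    """
    if not 0.0 < gas_fraction < 1.0:
        raise ValueError("gas_fraction must lie strictly between 0 and 1")
    mu = gas_fraction
    return yield_p * (1.0 + mu * math.log(mu) / (1.0 - mu))


if __name__ == "__main__":
    p = 0.0025  # illustrative yield (assumption for this example)
    for mu in (0.9, 0.5, 0.1, 0.01):
        z_gas = gas_metallicity(mu, p) / p
        z_stars = mean_stellar_metallicity(mu, p) / p
        print(f"mu={mu:5.2f}  Z_gas/p={z_gas:6.3f}  <Z_stars>/p={z_stars:6.3f}")
```

As the gas fraction tends to zero the gas metallicity grows without limit while the mean stellar metallicity tends to the yield, which is the behaviour described above.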

Magellanic-type Irregular galaxies fit this relation reasonably well, with
an implied value of $p \sim 0.0025$. More massive, earlier-type spiral galaxies fit it less
well. Not least, the distribution of metallicities in long-lived G-dwarf stars in
the solar neighbourhood exhibits a substantial deficiency of low metallicity
stars. Possible and quite plausible solutions to this problem are: (a) stars
produced early in the history of the disk had an initial mass function (i.m.f.)
heavily biassed against long-lived low mass stars and remnants; (b) the gas
in the initial purely gaseous disk was already enriched to 0.25 $Z_\odot$, perhaps by
the spheroid; (c) continuing infall of gas on to the disk effectively “dilutes”
the very low metallicity stars.

The pattern of metal enrichment of different elements relative to each
other also contains information on the timescales over which the enrichment
occurred. The two main mechanisms for enrichment of the interstellar
medium are supernovae SN II resulting from core collapse in massive stars
and supernovae SN Ia produced by accreting white dwarfs. These produce
different relative abundances in that “α-elements” (obtained by assembling
$^4$He nuclei into larger nuclei) are enhanced relative to Iron in SN II, whereas
Fe is predominantly produced in SN Ia. The two types of supernovae also
trace the star-formation rate after different delay times - very short for
SN II (< 10^ years) and much longer for SN Ia (> 1 Gyr). The [O/Fe] ratio
thus represents a clock of the enrichment history.

The [O/Fe] ratio as a function of [Fe/H] in the solar neighbourhood of
the Milky Way Galaxy shows a pronounced change at [Fe/H] ~ -1, i.e. 0.1
solar, almost exactly where the kinematics changes from the halo to the
disk. This suggests that the enrichment of the halo took place in much less
than a Gyr. The small scatter in [O/Fe] at [Fe/H] < -1 also suggests that
the early enrichment was quite uniform; if it had been clumpy, the individual
clumps must have proceeded to similar enrichment points by the 1
Gyr turn-on of SN I. This time scale of 1 Gyr is not much longer than the
dynamical time of the metal-poor halo, indicating that enrichment probably
proceeded during initial collapse. The small scatter at [Fe/H] > -1 suggests
that the SN I/SN II ratio has been roughly constant over the lifetime of the
disk, ruling out large changes in the SFR on timescales of 1 Gyr or less.
The association of a change in the enrichment pattern with kinematic behaviour
may or may not be a causative effect (since the cooling time for gas
drops significantly at [Fe/H] ~ -1.5).

4.9 Galaxy spectral synthesis models 

Knowledge of the star-formation history of a galaxy and knowledge of the
initial mass function of stars produced during star-formation should allow
us to construct the brightness of the galaxy at all wavelengths as a function
of cosmic time. There is an extensive literature of such models [41-44].
There are however a number of complications. First, some of the post-Main
Sequence stages of the evolution of stars on the Giant Branch, the Horizontal
Branch and the Asymptotic Giant Branch are not well understood and
depend on the amount of mass-loss from the stars during the later stages
of their evolution. Secondly, the evolution of stars (including those stages
affected by mass loss) and their spectral energy distributions both depend on
the metallicity of the stars, which will change during the history of a galaxy
(see 4.7). Thirdly, the initial mass function may change, perhaps reflecting
different environments of star-formation, e.g. star-formation in a chaotic
proto-spheroid may produce a different i.m.f. than later star-formation
in Giant Molecular Clouds in the disks of large spiral galaxies. Finally,
the effect of dust extinction in modifying the spectral energy distribution
of galaxies is a significant uncertainty because it depends on the relative
geometries of sources and obscurers (i.e. on the distribution of dust) and
on the properties of the dust.

Nevertheless the latest models are quite sophisticated and give a good 
idea of the evolution of a stellar population. In broad terms, a given stellar 
population starts off bright and blue and dims and reddens with time as 
the Main Sequence burns down. 

Estimating the star-formation rate from gross properties of a galaxy 
is often of considerable interest. This is usually highly dependent on the 
i.m.f. because it is the most massive stars that produce the luminosity at 
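early times but the lowest mass stars that contain the mass. Kennicutt [45]
has given a number of useful conversion factors between observable quantities
and star-formation rates (assuming star-formation is maintained for
$10^8$ years).

$$\mathrm{SFR}\,(M_\odot\,\mathrm{yr}^{-1}) = 1.4 \times 10^{-21}\, L_\nu(\mathrm{UV}/\mathrm{W\,Hz^{-1}}) \tag{4.14}$$

$$\mathrm{SFR}\,(M_\odot\,\mathrm{yr}^{-1}) = 7.9 \times 10^{-35}\, L(\mathrm{H}\alpha/\mathrm{W}) \tag{4.15}$$

$$\mathrm{SFR}\,(M_\odot\,\mathrm{yr}^{-1}) = 4.5 \times 10^{-37}\, L(\mathrm{FIR}/\mathrm{W}). \tag{4.16}$$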
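
These conversions are simple linear scalings and are easily coded. The sketch below assumes that the coefficients of equations (4.14)-(4.16) are the calibrations of [45] expressed in SI units (W and W Hz$^{-1}$), and that the i.m.f. and the continuous star-formation assumption of that calibration apply; the function names are illustrative only.

```python
# Conversion factors, assumed here to be the calibrations of [45] in SI units.
UV_COEFF = 1.4e-21      # M_sun/yr per (W/Hz) of ultraviolet continuum luminosity
HALPHA_COEFF = 7.9e-35  # M_sun/yr per W of H-alpha line luminosity
FIR_COEFF = 4.5e-37     # M_sun/yr per W of far-infrared luminosity


def sfr_from_uv(l_nu_w_per_hz):
    """Star-formation rate (M_sun/yr) from the UV luminosity density in W/Hz."""
    return UV_COEFF * l_nu_w_per_hz


def sfr_from_halpha(l_halpha_w):
    """Star-formation rate (M_sun/yr) from the H-alpha luminosity in W."""
    return HALPHA_COEFF * l_halpha_w


def sfr_from_fir(l_fir_w):
    """Star-formation rate (M_sun/yr) from the far-infrared luminosity in W."""
    return FIR_COEFF * l_fir_w


if __name__ == "__main__":
    # An H-alpha luminosity of 1e35 W corresponds to 7.9 M_sun/yr.
    print(sfr_from_halpha(1.0e35))
```

Because each estimator rests on the luminosity of the most massive stars, the three values for a single galaxy agree only to the extent that the assumed i.m.f. and dust corrections are appropriate.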

4.10 Semi-analytic models

A number of ab initio models for the evolving galaxy population have been 
based on the Press-Schechter approach. This is used to generate merging 
trees for the dark matter. Simple prescriptions for the behaviour of the 
baryons within these dark matter potential wells, in terms of the cooling 
of gas, the formation of stars and the possible expulsion of gas are then 
applied, with parameters usually adjusted so that some properties of the local
galaxy population (e.g. the luminosity function and morphological mix)
are reproduced. Superficially, these so-called “semi-analytical” models (e.g.
[46-50]) have had some successes. Not least, Cole et al. [46] largely predicted
the rough shape of the luminosity density-redshift relation. However,
these models are still too ad hoc to have real predictive power. More detailed
simulations of individual galaxies (e.g. [9]) treat the behaviour of the
gas in a more physically detailed way, but even here we are still a long way 
from having adequate simulations to describe the evolution of the galaxy 
population over all epochs. 

5 The formation and evolution of galaxies: The local view 

Nearby galaxies can of course be studied in considerable detail, this being 
especially true of the Milky Way galaxy. The properties of the stellar pop- 
ulations in these present-day galaxies represent a fossil-record of what has 
gone on before. This complements the direct studies at high redshift. 

5.1 Star formation in disk galaxies and starbursts 

In a classic and influential paper, Twarog [51] analysed the number densities 
of F and G stars as a function of age and metallicity in the Galactic disk and 
derived the star-formation history and age-metallicity relation and showed 
that these were more or less self consistent if there was continuing infall of 
material. More sophisticated analyses of the history of star formation in 
the Galactic disk have been recently carried out by Prantzos and Silk [52]. 
These convincingly show that the history of star-formation in the Milky
Way disk has been extended over cosmological timescales (> 6 Gyr) with a
smoothly changing star-formation rate that declines by a factor of about 5
from a broad peak at about 6-12 Gyr ago.

Turning (more crudely) to other spiral galaxies, the birth-rate parameter
$b = \mathrm{SFR(now)}/\langle \mathrm{SFR} \rangle$ can be estimated from Hα fluxes to determine
SFR(now) and broad-band continuum measurements to determine $\langle \mathrm{SFR} \rangle$.
In local galaxies, $b$ varies (with considerable scatter) from 0.1 in Sa disks
to ~1 in Sc disks to about 2 in typical Sm/Irrs (see [45] and references
therein). Thus we would conclude that the later spiral galaxies have had a
roughly constant star-formation rate over all time.







A similar picture emerges from examining the star-formation rate as
a function of gas density within spiral disks. Normal disk galaxies follow
quite well a relation of the form [53] $\Sigma_{\rm SFR} \propto \Sigma_{\rm gas}^{n}$ with $n \sim 1.5$ (see [45]).
Typical disks contain about 20% by mass of gas and convert about 5% of
that to stars every $10^8$ years, suggesting that the disks are built up over
a timescale of $10^{10}$ years. The gas depletion timescale of $2 \times 10^9$ years is
likely extended by gas infall [54]. Interestingly the more extreme star-bursts
follow an extrapolation of this law, a reasonable fit for the overall population
being:

$$\Sigma_{\rm SFR} = (2.5 \pm 0.7) \times 10^{-4} \left(\frac{\Sigma_{\rm gas}}{1\,M_\odot\,\mathrm{pc}^{-2}}\right)^{1.4 \pm 0.15} M_\odot\,\mathrm{yr}^{-1}\,\mathrm{kpc}^{-2}. \tag{5.1}$$

These more vigorous star-bursts have higher efficiencies than the normal
disks (~30% per $10^8$ years). The most extreme star-bursts with implied
star-formation rates of 500-1000 $M_\odot$ yr$^{-1}$ will be converting a very substantial
fraction of the interstellar medium of a galaxy ($10^{10}\,M_\odot$) into stars
on the minimum dynamical timescale of $10^8$ years [55]. Mergers between
massive galaxies are an efficient way of channelling such large amounts of
gas into the central kpc of galaxies where the surface density reaches high
enough levels that extreme rates of star-formation can be sustained. The
most extreme star-burst galaxies lie close to the maximum luminosity possible
for a self-gravitating system of gas [55].
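
The shortening of the gas consumption time with increasing surface density follows directly from the power law. The sketch below assumes the central values of the fit of equation (5.1), namely a normalisation of 2.5 x 10^-4 and an index of 1.4, and ignores the quoted uncertainties and any gas infall; the function names are illustrative only.

```python
PC2_PER_KPC2 = 1.0e6  # square parsecs in one square kiloparsec


def sfr_surface_density(sigma_gas, norm=2.5e-4, index=1.4):
    """Star-formation rate surface density in M_sun/yr/kpc^2.

    sigma_gas is the gas surface density in M_sun/pc^2; norm and index are
    the assumed central values of the power-law fit.
    """
    if sigma_gas <= 0.0:
        raise ValueError("sigma_gas must be positive")
    return norm * sigma_gas ** index


def gas_depletion_time_yr(sigma_gas, norm=2.5e-4, index=1.4):
    """Time in years to consume the gas at the present rate, with no infall."""
    gas_per_kpc2 = sigma_gas * PC2_PER_KPC2  # M_sun per kpc^2
    return gas_per_kpc2 / sfr_surface_density(sigma_gas, norm, index)


if __name__ == "__main__":
    for sigma in (1.0, 10.0, 100.0, 1.0e4):
        t_dep = gas_depletion_time_yr(sigma)
        print(f"Sigma_gas={sigma:8.1f} M_sun/pc^2  t_dep={t_dep:.2e} yr")
```

Since the index exceeds unity, the depletion time falls as the surface density to the power -0.4, so dense central star-bursts exhaust their gas much faster than normal disks.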

5.2 Spheroids and the elliptical galaxies 

Most of the stars in the Universe today are in the spheroidal components
of galaxies: the bulges of spiral galaxies and the elliptical galaxies. Their
formation has been the subject of much study and discussion. It is now clear
that the bulges of spiral galaxies exhibit a wide variety of structure (e.g.
[56] and references therein). Those of early-type galaxies appear to have
predominantly $r^{1/4}$ profiles, similar to the elliptical galaxies, and may have
formed through similar processes relatively early in the Universe. Later-type
galaxies (e.g. Sb) have exponential bulges and these may well have
formed in a more continuous fashion through the secular evolution of the
disk (see [57,56]).

Estimating the detailed properties of stellar populations (e.g. ages and
metallicities) from integrated properties is notoriously difficult on account of
the degeneracies between age and metallicity in most of the observationally
accessible indices (see e.g. [10]). Studies of the resolved stellar populations
in the bulges of the Milky Way and of M 31 indicate that these are
old, metal rich stellar populations with hardly any intermediate age stellar
populations. Indeed the detailed elemental abundances in the bulge of the
Milky Way suggest enrichment of the α-elements [58,59] relative to the Sun.
This indicates that bulge formation was completed before the first Type Ia
supernovae occurred (since these boost the non-α-elements and in particular
Fe). The timescale for the appearance of Type Ia supernovae is not well
determined but is likely one to a few Gyr.

Evidence that the population of elliptical galaxies formed early also 
comes from studying the small dispersion in their integrated properties, 
since any variation in age or evolutionary history would introduce a scatter 
(since the properties of any individual galaxy will evolve with time). This 
has been shown for the colour-magnitude relation in Virgo and Coma [60] 
and for the fundamental plane [61]. A small dispersion in (relative) age
within the population argues for an early formation, since it is natural to
associate Δt with the Hubble time at the time of formation (otherwise an
uncomfortable degree of synchronicity is required). As noted below, several
studies have established that the ellipticals in clusters remain a very
homogeneous population to z ~ 1. However, it should be appreciated that these
constraints do depend on the evolutionary models, and the very high formation
redshifts sometimes claimed for the results of these population-dispersion
studies are in my opinion not compelling and formation redshifts
at z ~ 2 are not ruled out (see [62] for a discussion).

A further complication is that the most-studied elliptical galaxies have 
generally been those found in rich clusters of galaxies that would be expected 
to have gone through their evolution more rapidly than those in the field 
(because of the higher density and earlier collapse of their environments). 
On the other hand, Bernardi et al. [63] have shown that there is very little 
difference between the Mg-σ relation for ellipticals in the field and in rich
clusters. 

5.3 Ultra-luminous galaxies 

The nature of the energy source in these galaxies has been long debated. 
Recent spectroscopic data from ISO [64] suggest that, at the luminosities
of the prototypical ULIRG Arp 220 ($2 \times 10^{12}\,h_{50}^{-2}\,L_\odot$), both starbursts and
AGN contribute, with the former dominating at the roughly 3:1 level. Local
ULIRGs appear to be generally associated with the mergers of massive
galaxies, the end-point of which is likely to be the production of an el- 
liptical galaxy. It should be appreciated that the ULIRGs do not contain 
exceptional amounts of dust. The high levels of dust extinction arise because 
the gas is highly concentrated around a compact energy source leading to 
very high optical depths for dust obscuration. Simulations show [65] that,
during a merger between massive galaxies, the gas is funnelled down to concentrate
at the bottom of the potential wells. However, ULIRGs are quite
rare in the local Universe and objects with luminosities exceeding that of
Arp 220 (which is the nearest object of its luminosity at z = 0.03) contribute
only about 0.3% of the total bolometric luminosity output of the
local Universe (see [17] and references therein). Thus in global terms they
might be considered a curiosity at the present epoch.

6 Evolution at cosmologically significant redshifts 

6.1 Redshifts z < 1

6.1.1 Methodologies 

The last decade has seen a dramatic improvement in our knowledge of 
“normal” galaxies at significant look-back times. This can be traced to 
three technical developments: the introduction of highly multiplexed multi-object
spectrographs on 4-m class telescopes; the attainment of HST’s
full imaging capabilities; and, especially for the highest redshifts z > 2.5,
the commissioning of the Keck 10-m telescope. The 0 < z < 1 regime
(i.e. an octave in (1 + z)) is of interest because galaxies can be selected
over this redshift interval in a uniform way entirely within the “optical”
window 3500 < λ < 9000 Å allowing the presence of evolutionary effects to
be unambiguously seen.

6.1.2 The evolving population of galaxies 

There is now little doubt that significant changes have occurred in the galaxy 
population since z ~ 1 (see e.g. [66,67]). There is a clear increase towards 
higher redshifts in the comoving number density of blue galaxies with moderate
luminosities, $M_{AB}(B) \sim -21$ (comparable to present-day L*). In contrast
the number densities and properties of redder galaxies show much less
change [66,68-70]. In essence, the broad colour-luminosity relation which is
seen at low redshifts, whereby the most luminous galaxies in B are generally
quite red and the bluest galaxies are generally of low luminosity, is eradicated
by z ~ 1. Work over the last 4 years has focused on understanding
the nature of this evolving population. 

Morphological studies with the HST have demonstrated that this evolv- 
ing population has irregular or late-type galactic morphologies [71,72] and 
that the galaxies are generally quite small with half light radii below 
5 kpc [73]. In contrast, the number of galaxies exhibiting large disks 
seems to have remained roughly constant since z ~ 0.8 [73] although this 
could also be consistent with a growth in disk scale length with epoch if some 
disks were also being destroyed through merging events (see e.g. [74]). The 
degree of surface brightness evolution that has occurred in the disks of nor- 
mal spiral galaxies is not well determined, ranging from 0.8 mag [73] to 
no change in surface brightness at all [75] - the difficulties arising in the 
treatment of selection biases that enter either directly or through other, at 
first sight unrelated, selection criteria. Analysis of the colours and surface
brightnesses of the disks in the CFRS sample led Lilly et al. [73] to suggest
that the star-formation rate was about 4 times higher at z ~ 1 in typical
spiral galaxies. There is some evidence that barred galaxies may have been less
common in the past [76].

The internal kinematics of the galaxies provide a very important di- 
agnostic and there has been a large and rather confusing literature based 
on studies of different samples at different redshifts. The clearest picture 
comes from the work of Vogt et al. [77,78] on the extended rotation curves 
of large disk galaxies out to z ~ 1. These show fairly unambiguously that 
the Tully-Fisher relation is in place for these galaxies and that the offset 
relative to local samples is small. In terms of the smaller galaxies, which 
outnumber the large disks at high redshift, only integrated velocity disper- 
sions have been measured. These are small [79,80] and characteristic of 
dwarf galaxies, σ < 100 km s$^{-1}$. Guzman et al. [79] have argued that many
of these objects will evolve into dwarf spheroidals, a hypothesis disputed by 
Kobulnicky and Zaritsky [81] on the basis of metallicities. Mallen-Ornelas 
et al. [80] have simply pointed out that these kinematics are consistent 
with the velocity dispersions of galaxies with similar sizes, morphologies and 
colours that are typically found about 2 magnitudes fainter in the present- 
day Universe, although analogues of the same brightness can also be found, 
albeit at much lower comoving number density. 

6.1.3 The early-type galaxy population 

An important and still-unresolved question concerns the number density of 
field elliptical galaxies at high redshifts [18,68,82,83]. This is an interesting 
question as ellipticals may well be the result of major mergers between 
massive galaxies, and at some level, may represent the end-points of the 
evolution of large galaxies. Part of the difficulty lies in defining what is 
an “elliptical” at high redshift in the face of incomplete information (see 
Figs. 5 and 6 of Schade et al. [84]). Colour selection criteria must be 
very carefully applied (and in any case such a cut will include reddened 
spirals and exclude ellipticals with some modest amount of continuing star- 
formation) while morphological criteria are frequently ambiguous in the case 
of small galaxies and kinematic criteria are almost impossible to apply at 
high redshifts. In my opinion the evidence for a significant change in number 
density is weak, but quite large changes (a factor of two) cannot be ruled 
out with the presently available data. At higher redshifts, both Zepf et al. 
[85] and Barger et al. [86] have claimed an absence at z > 1 of galaxies red 
enough to be completely dead elliptical galaxies formed at high redshift. 
However, even quite modest amounts of star-formation (5% of the stellar
mass produced over a Hubble time) are sufficient to alter the colours enough
so that the galaxy would never satisfy these colour criteria even if formed
at arbitrarily early epochs.

Fig. 6. A montage of disk galaxies with large scale lengths at 0.5 < z < 0.75 from
the CFRS, from [73]. The number densities of such galaxies appear to be roughly
constant to z ~ 0.8 and their rotation curves also show little evolution.



Elliptical galaxies dominate the galaxy populations of rich clusters of 
galaxies. To the highest redshifts that such clusters are known (z ~ 1), 
they contain well-defined populations of red galaxies which still display a 
small scatter in intrinsic properties [87].

6.1.4 The importance or otherwise of mergers

Another very important (and presumably related) question is the importance
of merging at z < 1 (as well as, of course, at higher redshifts). This is
a surprisingly tricky question, largely because knowledge of the merging
timescale (or the time that a given phenomenon will be visible) is required
to go from an observed phenomenology to an astrophysical rate. Most
observational studies looking at pair counts of galaxies or the morphological
signatures of merging have found a substantial increase with redshift
although there are a number of subtle biases that can enter into such analyses
(see [90,91]). In the Le Fevre et al. [90] study of the HST images
of CFRS galaxies, we found that by z ~ 1, 15% of images showed signs
of major mergers or interactions even allowing for biases in the selection
of the original sample. The fraction of galaxies in close pairs or displaying
suspicious morphologies in HST images appears to increase roughly as
a power of (1 + z). The estimate given by Le Fevre et al. [90] is that typical
massive galaxies have undergone of order one merger in the 0 < z < 1 interval
and that the merger activity was concentrated at the start of that period,
z ~ 1. On the other hand, Carlberg et al. [92] have estimated the same
quantity from integrating the $\xi(r, v)$ correlation function at lower redshifts
(z < 0.5) and find a weaker rise with redshift. There are however important
differences between these approaches; the latter for instance is sensitive to
mergers between comparably sized objects and this may be the cause of the
discrepancies. The whole importance of merging in the assembly of galaxies
at different epochs is a major open question at this point in time. Toth and
Ostriker [93] have argued that most spiral galaxies cannot have undergone
major mergers since the time that their disks were formed since this would
have disrupted the disks.

6.1.5 The evolution of galaxies in rich clusters 

There is an extensive literature on the evolution of galaxies within clusters;
indeed the first evidence for any evolutionary effect in the galaxy population
came from the study of cluster galaxies at z ~ 0.4 [94] and, until the
advent of efficient multi-object spectrographs, cluster galaxies were the only
non-active galaxies that could be studied at high redshifts. Now that field
samples (i.e. those selected without regard to cluster environment) are
available at almost all cosmological epochs, the interest in cluster galaxies
is more in understanding how their evolution differs from the field and the
degree to which this is due to the unusual circumstances of the clusters, e.g.
the early collapse of the cluster, the high velocities between galaxies and
the high pressure of the intergalactic medium (e.g. [95-98]).

One of the most interesting results has been the demonstration that the 
uniformity of the elliptical population in clusters is maintained to redshifts
approaching z ~ 1, and that the average properties of the galaxies in terms
of the colour-magnitude relation and the fundamental plane change only
slowly with redshift [87,99-101]. This implies early formation for these
galaxies, but the richest environments would have been expected to have
formed first. Oemler [102] summarizes the emerging picture: it appears
that by a redshift of unity, the galaxies in the cores of the clusters have
evolved into ellipticals. The continuing infall of galaxies into clusters brings
galaxies that have been continually forming stars. These undergo bursts of
star-formation (a process that may also be occurring in the field, see above)
that in the cluster environment are frequently followed by a complete loss
of interstellar medium, truncating the star-formation activity and leading
to gas-depleted S0 galaxies.

6.1.6 Inside the galaxies 

Finally, it should be stressed that most of the foregoing work (with the 
exception of the rotation curves and the morphological decompositions) 
has been based on the integrated properties of the galaxies. Study of the 
properties of different components within galaxies offers immense promise in 
further understanding the evolution of galaxies and in particular the central 
question of how the formation and evolution of spheroids and disks are 
related. This work is just beginning, pioneered by Abraham et al. [103] and 
will advance rapidly as 8-10 m class telescopes achieve the HST-level spatial
resolution required to easily resolve the internal structure of galaxies. The 
initial results are encouraging: Abraham et al. [104] presented evidence that 
the high redshift “chain galaxies” previously identified by Cowie et al. [105] 
were produced by star-bursts propagating down linear structures. Abraham 
et al. [103] have studied the spatially resolved colour distributions of normal 
ellipticals and spirals at z ~ 0.5 and derived star-formation histories that 
seem to agree well with those derived for the galaxy population as a whole. 
Future study of the spatially resolved properties of high redshift galaxies 
appears very promising. 



6.2 Redshifts z > 3 

6.2.1 Detection and identification 

Undoubtedly one of the most important breakthroughs made with the Keck
10-m has been the isolation of star-forming galaxies at high redshifts z > 3,
selected by the distinctive spectral feature caused by the 912 Å Lyman
limit and extrinsic Lyα absorption at λ < 1216 Å. Galaxies are extremely
faint shortward of the Lyman limit at 912 Å and are faint between 912 <
λ < 1216 Å on account of the Lyα forest absorption from neutral gas
distributed along the line of sight. Thus a distinctive signature of a high
redshift star-forming galaxy is a strong blue continuum that abruptly disappears
at shorter wavelengths leading to the common description as “U-dropouts”
(z ~ 3), “B-dropouts” (z ~ 4) and so on. Galaxies isolated in this
way can be spectroscopically confirmed using conventional spectroscopy if
bright enough (in practical terms AB < 25.5; Steidel et al. [104]). Although
the potential of this approach had been known for some time [106,107] it
required the Keck to carry out confirming spectroscopy [104] and it is now
quite remarkable that there are almost 1000 spectroscopically confirmed
galaxies at z > 3, as many as at 0.5 < z < 1.5. However, it should be
appreciated that this technique requires objects to be relatively bright at
1200 < λ < 2000 Å and so it is not suited to isolating either quiescent
galaxies that are not actively forming stars or galaxies suffering large amounts of
dust obscuration.
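
The reason the break moves from one filter to the next with increasing redshift is simply the (1 + z) stretching of the rest-frame wavelengths. A minimal sketch, using the rest wavelengths of 912 Å and 1216 Å quoted above (the function names are illustrative only):

```python
LYMAN_LIMIT_A = 912.0   # rest-frame Lyman limit in Angstroms
LYMAN_ALPHA_A = 1216.0  # rest-frame Lyman-alpha in Angstroms


def observed_wavelength(rest_wavelength_a, z):
    """Observed wavelength in Angstroms of a rest-frame feature at redshift z."""
    if z < 0.0:
        raise ValueError("redshift must be non-negative")
    return rest_wavelength_a * (1.0 + z)


def break_window(z):
    """Observed wavelengths of the Lyman limit and of Lyman-alpha at redshift z.

    Below the first value the galaxy is essentially dark; between the two
    values the continuum is depressed by the Lyman-alpha forest.
    """
    return (observed_wavelength(LYMAN_LIMIT_A, z),
            observed_wavelength(LYMAN_ALPHA_A, z))


if __name__ == "__main__":
    for z in (3.0, 4.0, 5.0):
        limit, lya = break_window(z)
        print(f"z={z:.1f}: Lyman limit at {limit:.0f} A, Lyman-alpha at {lya:.0f} A")
```

At z = 3 the Lyman limit falls at 3648 Å and Lyα at 4864 Å; at z = 4 the corresponding wavelengths are 4560 Å and 6080 Å.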

Another promising (and also long anticipated) approach has been to find
high redshift galaxies through strong Lyα emission lines [108-112]. The contrast
between line and continuum (the equivalent width) increases as (1 + z).
In addition, the effects of dust absorption, which are particularly severe for
Lyα because of resonant scattering effects, might be expected to decrease
at the highest redshifts. Accordingly, most of the highest redshift galaxies
have also been found through emission line searches, either through
serendipitous discoveries in long-slit spectroscopy, slitless spectroscopy or
narrow-band filter imaging. Most of the z > 5 galaxies have been found
this way, including the object with the highest claimed (but as yet unconfirmed)
redshift [113]. These objects are characterized by strong narrow
Lyα emission. These line-selected galaxies show substantial overlap with
the continuum “break-selected” objects and some objects at z > 5 have
only very weak Lyα emission [114].

6.2.2 Luminosity function and properties 

The luminosity function for the z > 3 population has been constructed [115] 
from the extensive spectroscopic sample and from photometrically estimated 
redshifts for fainter galaxies in the HDF. This shape is remarkably similar 
to that seen at low redshifts in say the Hα luminosity function. Contrary
to earlier estimates (Madau et al. [116]) the ultraviolet luminosity function 
and by extension the ultraviolet luminosity density (see Sect. 5.5) shows 
little change over the interval 2.7 < z < 4.5. 

The spectral energy distributions of these galaxies are blue and indica- 
tive of stellar populations dominated by young stars. As nicely illustrated 
in Dickinson [117], the spectral energy distributions of the high redshift 
ultraviolet-selected galaxies range between the bluest star-bursts known locally
and the equivalent with about a magnitude of reddening. Very little is
known about the existence of old (i.e. Gyr) populations at these redshifts.

Morphologically, the galaxies (seen in the ultraviolet) are generally compact
with extended irregular structure [118]. An extremely important result
[117] is that near-infrared images taken with NICMOS on HST show that in 
most cases the morphologies of the ultraviolet-selected star-forming galaxies 
are similar in the rest-frame visible waveband, indicating that the ultraviolet 
light is not coming from a small region of star-formation existing in a large 
“old” galaxy with regular morphology. There are a few clear exceptions 
to this but most objects have very similar morphologies, suggesting that 
the light at all wavelengths is likely dominated by young stars. Indeed, it 
appears that there is a dearth of large spiral galaxies at z > 2.5, suggesting 
that the classical Hubble sequence of galaxy morphologies is established in 
the 1.0 < z < 2.5 interval. I believe that this is a very important result: the 
z > 2.7 population may or may not be the progenitors of normal galaxies, 
but they should not, I believe, be regarded as “normal” massive galaxies 
similar to those we see today, despite the similarity in the luminosity function.
This is not to say that the internal physics is not the same - one of
the impressive things about high redshift galaxies is that they do seem to
be undergoing similar physical processes as seen locally, but rather that the
manifestations of these processes are not the same (i.e. star-formation is not
in disks and so on).

An important question that enters with any ultraviolet selected sample is
the amount of dust extinction in the ultraviolet, since this can dramatically
affect the estimates of star-formation rates from measurements of the
ultraviolet continuum luminosity. Ideally, the bolometric luminosity of
thermal dust emission would be measured, but this is at present possible for
only the most heavily extinguished systems (see Sect. 9 below) and estimates
must rely on measurements of the colour change, either in the ultraviolet or
in nebular emission lines [119]. Based on standard extinction laws, both
estimates suggest that the ultraviolet continuum is suppressed by modest
factors (1-2 mag, or a factor of 2-6). Even with this correction, the
star-formation rates implied for typical L* members of the population are
modest, typically around 40 M⊙ yr⁻¹. However, the ultraviolet selected
population clearly does contain some much more luminous, heavily obscured
galaxies [120] (see Sect. 9 below).
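
As a quick check on the numbers quoted above, the conversion between an extinction in magnitudes and a multiplicative flux suppression can be sketched as follows. This is only an illustration of the standard magnitude definition; the function name is hypothetical and nothing here reproduces the extinction-law fitting of [119].

```python
def suppression_factor(a_mag: float) -> float:
    """Factor by which the observed flux is reduced for a_mag magnitudes of extinction."""
    # By definition, a difference of dm magnitudes is a flux ratio of 10**(0.4 * dm).
    return 10 ** (0.4 * a_mag)


for a_mag in (1.0, 2.0):
    print(f"A = {a_mag:.0f} mag -> flux suppressed by a factor {suppression_factor(a_mag):.2f}")
```

An extinction of 1 to 2 mag therefore corresponds to a factor of roughly 2.5 to 6.3, in line with the "factor of 2-6" quoted in the text.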

The metallicities of these galaxies would give important clues as to how
they relate to present day galaxies, but are hard to measure: Trager et al.
[121] have argued that the metallicities are low (Z < 0.1 Z⊙) whereas Pettini
et al. [199] have made a detailed study of MS1512-cB58 at z = 2.727 [122],
which is a typical member of the population fortuitously lensed by a factor
of about 30 by a foreground cluster. They derive metallicities of about 1/4
solar. The all-important masses of these galaxies are still harder to get at
and are very poorly constrained.

Fig. 7. A montage of galaxies from the HDF observed in the visible (BVI) and
near-infrared (IJH). The morphologies are similar in both wavebands for the vast
majority of galaxies, demonstrating that the morphological peculiarities of faint
galaxies are not simply a wavelength bias. From [110].



6.2.3 Clustering and biasing

Measurements of the clustering of the Lyman break population [123,124]
have shown that the clustering of bright Lyman break galaxies is as strong
as at low redshifts, with a correlation length of about 5 h₁₀₀⁻¹ comoving Mpc.
This is taken as evidence for biasing, and indeed the change of clustering
with luminosity (and thus number density on the sky) mimics the effect
expected in simple biasing models [36] (see Sect. 4.4). One interpretation
of this is that the ultraviolet luminosity traces dark halo mass and that the
most luminous galaxies are the progenitors of present-day massive spheroids.
This may or may not be the case.

6.2.4 The nature of the Lyman-break population 

Many authors have suggested that the Lyman break population is form- 
ing the massive spheroidal components of galaxies, i.e. the massive bulges 
and ellipticals seen today. While this cannot be ruled out, I am person- 
ally quite struck by the fact that the sizes, colours, morphologies and the 
optical/ultraviolet luminosities of the z > 3 Lyman-break selected galaxies 
are in many ways similar to those of the “small galaxies” population that 
increasingly dominates the z > 0.7 population (see [125]) although we have 
little mass information for the former and almost no metallicity informa- 
tion for the latter. This similarity may or may not be significant, but it 
makes me personally wary of viewing the Lyman break population as nec- 
essarily being the progenitors of today’s massive galaxies. An additional 
uncertainty in placing these galaxies in the context of present-day galaxies 
comes from our almost total lack of information on the much more luminous 
very heavily obscured galaxies which are found in the deep sub-mm surveys 
and which have a luminosity density at least as large as the Lyman-break 
selected population (see Sect. 9).

6.3 The observational “gap” at z = 2

As we learn more about the Universe at z ~ 1 and at z ~ 3, the absence
of information at z ~ 2 becomes increasingly frustrating. This arises from
the technical difficulty of carrying out faint object spectroscopy at
λ > 8000 Å, where the atmospheric OH background is strong and highly
featured. The strong spectral features at around 4000 Å in the rest-frame
(e.g. the Ca H and K absorption and [OII] 3727 emission) enter this
region at z > 1, while (as seen above) the next set of strong features around
Lyα 1216 and the 912 Å Lyman break does not enter the short wavelength
end of the optical window until z > 2.5. Both the selection and identification
of galaxies at z ~ 2 are therefore difficult and we do not have anything
like the systematic information available at lower and higher redshifts.
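
The origin of the gap can be illustrated with the relation λ_obs = λ_rest (1 + z). The sketch below assumes, purely for illustration, a usable optical window of 3200-8000 Å; only the 8000 Å red limit is stated in the text, and the blue limit and the function name are my assumptions, chosen to be consistent with the z > 2.5 figure quoted for the Lyman break.

```python
# Assumed usable spectroscopic window in Angstrom (blue limit is an assumption).
OPTICAL_WINDOW_A = (3200.0, 8000.0)

# Rest-frame wavelengths in Angstrom of the features named in the text.
REST_FEATURES_A = {
    "Lyman break": 912.0,
    "[OII] 3727": 3727.0,
    "4000 A features": 4000.0,
}


def redshift_window(rest_a, window=OPTICAL_WINDOW_A):
    """Return (z_min, z_max) over which a rest-frame feature lies inside the window."""
    blue, red = window
    z_min = max(blue / rest_a - 1.0, 0.0)  # feature enters at the blue edge
    z_max = red / rest_a - 1.0             # feature leaves at the red edge
    return z_min, z_max


for name, rest_a in REST_FEATURES_A.items():
    z_min, z_max = redshift_window(rest_a)
    print(f"{name:16s} observable for {z_min:.2f} < z < {z_max:.2f}")
```

With these assumed limits the 4000 Å features are lost beyond z = 1 and the Lyman break only becomes accessible at z of about 2.5, leaving the interval in between without a strong feature in the window.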

Quite a large number of relatively bright radio galaxies are known, and
some fraction of these at least are still demonstrably old at z ~ 1.5 [126,127].
Exactly how old has been much debated [128,129]. However, it should be
appreciated that these could conceivably represent only a small fraction
of the galaxy population and that the selection biases associated with the
strong active nucleus are not well understood. On the other hand, other
studies (e.g. [130]) of the field population have revealed large numbers of
small star-forming galaxies. Photometric redshift estimates of the small
galaxies in the HDF ([131] and references therein) and deep ground-based
surveys [132] suggest that a large fraction of them lie in the range 1 < z < 3.

If we accept the evidence above that massive, rather quiescent galaxies
with Hubble-type morphologies are present at z ~ 1 but are much rarer (or
absent entirely) at z > 2.5, then such galaxies (i.e. similar to the Milky Way
and the bulk, in terms of stellar population, of the local galaxy population)
must appear during this “gap” epoch. Tracing the emergence of the red
population from z ~ 2.5 to z ~ 1 is a major goal of new surveys getting
underway in the near-infrared waveband and will in my view constitute a
major test of the current paradigm.

7 The luminosity density as f(z) 

Global measures of the galaxy population such as the mean luminosity
density of the Universe at ultraviolet wavelengths or in an emission line such
as Hα or [OII] 3727 are relatively insensitive to the cosmology (depending
only on the increment in comoving distance dw/dz) and are related (with
important caveats concerning the effects of dust extinction, the initial mass
function, and so on) to the production of stars and heavy elements. They
can thus be related to other global measurements of the Universe such as the
mean density of neutral gas and the mean metallicity of that gas [168] and
to other observational quantities that purport to sample the star-formation
in ways independent of individual galaxies, such as the supernova rate or
gamma-ray burst rate.

Most studies have found substantial increases in these quantities with
redshift to z ~ 1, with exponents, (1 + z)^α, in the 3 < α < 4 range over
0 < z < 1. This behaviour has been seen in the 2800 Å continuum [33,70],
in [OII] 3727 [69,134] and in Hα [136]. In the mid-infrared, similar behaviour
is also indicated [137,138]. Nevertheless, Cowie et al. [139] have proposed
a lower exponent than Lilly et al. [133] for the 2800 Å luminosity
dependence, α ~ 1.5, a discrepancy that can be traced to different assumptions
made at the highest and lowest redshifts. This issue is clearly not settled
beyond doubt. At higher redshifts, the initial evidence for a decline in
ultraviolet continuum luminosity density with increasing z [116] has not been
sustained [115] and the available evidence is for a roughly flat luminosity
density extending to the highest redshifts securely probed, z ~ 5, and even
possibly beyond [100,131].
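
To make the size of the disagreement concrete, the following sketch evaluates the increase in luminosity density between z = 0 and z = 1 implied by a pure power law (1 + z)^α for the exponents mentioned above. The function name is hypothetical, and a single power law over the whole interval is itself a simplifying assumption.

```python
def growth_factor(z: float, alpha: float) -> float:
    """Increase in luminosity density from redshift 0 to z for a (1 + z)**alpha law."""
    return (1.0 + z) ** alpha


for alpha in (1.5, 3.0, 4.0):
    print(f"alpha = {alpha}: density at z = 1 is {growth_factor(1.0, alpha):.1f} x the local value")
```

An exponent in the range 3 to 4 implies a rise by a factor of 8 to 16 out to z = 1, whereas α ~ 1.5 implies a rise of less than a factor of 3.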

A major question is the extent to which these measures of luminosity
density can be taken as a measure of star-formation (it should be
remembered that even Hα is affected by extinction). Indications from
mid-IR and far-IR surveys [137,138,140] (see Sect. 9 below) suggest that the
luminosity densities in these wavebands are also increasing at least as fast
as indicated in the optical/ultraviolet and it therefore seems reasonable to
assume that these are all due to a general rise in star-formation activity.

Fig. 8. Estimates of the ultraviolet luminosity density of the Universe as a function
of redshift, taken from [110]. At high redshift the crosses are from [115] and are
more reliable than the other points. There are still some disputes about the slope
at z < 1.

Although the plot of luminosity density (and implied star-formation
rate) with redshift is hailed as a significant advance in cosmology, it should
also be remembered that it completely suppresses the individual identity of
galaxies and thus masks completely the physical processes involved.
Furthermore, it is not completely independent of cosmology. The effect of a
Λ-dominated cosmology is particularly pronounced in reducing the slope of
the rise and the sharpness of the change in slope at z ~ 1.

8 The cosmic evolution of active galactic nuclei

The radio-selected active galaxy population was the first to show
evolution with cosmic epoch, in the 1950s, and soon after the discovery
of optical quasars it was clear that they too showed strong evolution
with redshift. Since these early studies the evolution with redshift, and in
particular the possible existence of a “cut-off”, or rapid decline with
redshift, has been extensively studied.

Radio selected sources are particularly straightforward to study since 
one can be confident that the selection is well-defined, and in particular for 
samples that are completely identified, the effects of dust obscuration can be 
discounted. For several years, the radio studies have indicated a decline in 
the comoving number density at z > 3 [141,142]. The situation for optically 
selected quasar samples has been more complicated because their selection 
criteria are more clearly redshift dependent. 



9 Luminous objects at high redshifts: The hidden Universe 

A cursory glance at the properties of present-day galaxies suggests that 
many galaxies may have gone through a high luminosity phase early in 
their evolution. Not least, the massive metal-rich spheroidal components of 
galaxies, which likely comprise a half to two-thirds of the present-day stellar 
mass in the Universe ([22], see Sect. 2.3), have an age comparable to the age 
of the Universe and, unlike the disks, a structure that is consistent with and 
even indicative of, a rapid and dynamically active formation. Historically, 
much of the motivation for early studies of “faint blue galaxies” was the
realization by Tinsley [143] that galaxies forming stars at 1000 M⊙ yr⁻¹ (i.e.
sufficient to make a substantial galaxy in a dynamical time of order 10⁸ yr)
should be readily visible at U ~ 20, even at high redshift (z ~ 3). No such
optically-luminous galaxies have been found, despite surveys penetrating
to very much fainter levels than envisaged by her. Not least, as noted
above, the Lyman-break population [104] identified at z > 2.5 is found
about 5 magnitudes fainter and has implied star-formation rates (in the
ultraviolet, i.e. assuming no dust obscuration) of order 10 M⊙ yr⁻¹. Such
star-formation rates require a full (present-day) Hubble time of continuous
star-formation to build up a substantial stellar population.
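
The arithmetic behind this argument can be sketched as below. The Hubble time of 10¹⁰ yr is an assumed round number, the function names are hypothetical, and the star-formation rate is taken to scale linearly with ultraviolet flux, which ignores dust.

```python
HUBBLE_TIME_YR = 1.0e10  # assumed round value for the present-day Hubble time


def stellar_mass_formed(sfr_msun_per_yr: float, duration_yr: float) -> float:
    """Stellar mass (in solar masses) formed at a constant rate over a given duration."""
    return sfr_msun_per_yr * duration_yr


def flux_ratio(delta_mag: float) -> float:
    """Flux ratio corresponding to a magnitude difference delta_mag."""
    return 10 ** (0.4 * delta_mag)


# Rapid formation: 1000 Msun/yr over a dynamical time of 1e8 yr.
print(f"Rapid formation   : {stellar_mass_formed(1000.0, 1.0e8):.1e} Msun")
# Lyman-break galaxies: about 10 Msun/yr sustained for a Hubble time.
print(f"Lyman-break level : {stellar_mass_formed(10.0, HUBBLE_TIME_YR):.1e} Msun")
# Five magnitudes fainter is a factor of 100 in flux, matching 1000 -> 10 Msun/yr.
print(f"5 mag fainter     : factor {flux_ratio(5.0):.0f} in flux")
```

Both routes give of order 10¹¹ M⊙, but the second needs a hundred times longer, which is the point made in the text.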

The discovery by two experiments onboard COBE [19-21] of a
far-infrared/sub-millimeter background with a bolometric energy content as
large as or larger than that seen in the optical (see e.g. [144]) emphasized
that obscuration by dust is an important phenomenon in the early Universe, and
a great deal of effort has been spent in the last couple of years on
understanding the nature of the sources responsible for this emission. These
sources are being studied using surveys carried out with the ISO satellite
(at 15 μm and 175 μm) and from the ground at 850 μm using the new SCUBA
bolometer array on the 15 m JCMT. Although far from the peak of the
background at about 200 μm, observations at millimeter wavelengths are
potentially an effective way of identifying ULIRGs at high redshift. The
highly beneficial k-corrections at wavelengths around 1 mm largely compensate
for the increased distance, resulting in a flat S(z) relation over a wide
redshift range, e.g. the prototypical object Arp220 would be approximately as
bright at 850 μm at z ~ 8 as at z ~ 0.5!
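
The flatness of S(z) at 850 μm can be illustrated with a toy calculation. The sketch below is not a model of Arp220: it assumes a single-temperature modified blackbody (T = 45 K, emissivity index β = 1.5) and an Einstein-de Sitter cosmology, all of which are my assumptions, and all function names are hypothetical.

```python
import math

H_OVER_K = 4.799e-11  # Planck constant over Boltzmann constant, in K per Hz
C_UM_HZ = 2.998e14    # speed of light in micron * Hz


def greybody(nu_hz, temp_k=45.0, beta=1.5):
    """Unnormalised L_nu of a modified blackbody: nu**(3 + beta) / (exp(h nu / kT) - 1)."""
    x = H_OVER_K * nu_hz / temp_k
    return nu_hz ** (3.0 + beta) / math.expm1(x)


def lum_distance_eds(z):
    """Luminosity distance in units of c/H0 for an Einstein-de Sitter universe."""
    return 2.0 * (1.0 + z) * (1.0 - 1.0 / math.sqrt(1.0 + z))


def observed_flux(z, wavelength_um=850.0):
    """Relative observed flux density: (1 + z) L_nu(nu_obs (1 + z)) / D_L**2."""
    nu_obs = C_UM_HZ / wavelength_um
    return (1.0 + z) * greybody(nu_obs * (1.0 + z)) / lum_distance_eds(z) ** 2


reference = observed_flux(0.5)
for z in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"z = {z:3.1f}: S / S(z = 0.5) = {observed_flux(z) / reference:.2f}")
```

Under these assumptions the 850 μm flux density varies by well under a factor of two between z = 0.5 and z = 8, because the observed band climbs the steep Rayleigh-Jeans side of the dust spectrum as the redshift increases.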

Little is known about this population. At S₈₅₀ > 2 mJy, 20-25% of the
background is resolved into discrete sources and deeper exposures (or the
exploitation of gravitational lensing effects) suggest that at S₈₅₀ > 1 mJy,
a half of the COBE background is accounted for. These sources must have
Arp-220 level luminosities as long as they lie somewhere in the broad redshift
interval 0.5 < z < 8, and this immediately tells us that such sources must be
producing a significant fraction of the bolometric luminosity of the Universe
at high redshifts. A reasonable estimate is that 30% of the bolometric
output at high redshifts comes from ULIRGs, whereas, as noted above, the
corresponding fraction in the present day Universe is only 0.3%.

A major uncertainty concerns the redshifts of the sources. Identification
with optical/near-IR sources is rendered ambiguous because of the poor
resolution of the 15 m JCMT (15 arcsec FWHM at 850 μm). Some of the
sub-mm sources are detectable as radio sources [138,145,146] and deep
integrations with millimeter-wave interferometers have resulted in the
detection of the brighter SCUBA sources, but in my opinion none of the deepest
sub-mm surveys has been securely identified past the 50% level. In our own
survey [138,147] we have been impressed by how many sources (about 20%)
can be reliably identified with low redshift sources (z < 1), with many more
likely lying at z < 3. This limits the fraction of the sample which can lie
at very high redshifts. A median redshift z_median ~ 2 is consistent with the
extensive spectroscopy of candidate identifications in the Smail et al.
cluster lens survey presented by Barger et al. [148], even though many of
those candidates may not be the correct identification. Even quite
conservative limits on the fraction of sources that can lie at high redshifts
place quite strong constraints on the amount of high luminosity obscured
activity that can take place at high redshifts. If 50% of the activity occurs
at z > 3 then we would expect of order 85% of the 850 μm background to have
been produced at z > 3, a scenario that we can to all intents and purposes
rule out. Thus it appears that much of the activity represented by these
ultra-luminous sources occurs at lower redshifts, z ~ 2.

Another major uncertainty concerns the nature of the energy source.
Similar objects locally appear to be powered by both AGN and starbursts
and the two phenomena are likely closely related (see Sect. 5.1.4). If true
at high redshifts, this would mean that this population is supplying a
significant fraction of all stars, or a large fraction of the central black
hole mass, seen in the local Universe, or both.

Understanding this population will take some time (because so much
of the light is obscured) but I have no doubt that their high luminosities
and high total energy output as a population mean that this population
will hold the key to many questions about the formation and evolution of
massive galaxies.

10 Neutral gas 

Gaseous material, whether uniformly distributed or in discrete structures, 
can also be detected through its absorption of the light of background 
sources, such as quasars. Impressively small quantities of neutral 
Hydrogen and of neutral and ionized Helium and other heavier elements 
can be detected in this way. This intervening material, which is distributed
over cosmological scales, should be distinguished from absorbing material
associated with the quasar host galaxy itself (i.e. the broad absorption lines
and absorption systems at the same redshift).

10.1 Re-ionization of the IGM 

In all quasars observed so far, the neutral Hydrogen is located at discrete
redshifts. These structures, originally called “Lyman α clouds”, are better
viewed as representing a non-linear map of a continuous density field rather
than as discrete absorption systems. The amount of neutral Hydrogen in a
uniformly distributed neutral component is measured by the Gunn-Peterson
test, which looks at the suppression of the continuum shortward of the Lyα
line. This limit is extremely low, Ω_HI ~ 10⁻⁸. The amount of neutral
Hydrogen observed in the absorbing systems, Ω_HI ~ 0.002 h₁₀₀⁻¹ [149], is
comparable to the total baryonic material seen today in stars,
Ω_stars ~ 0.005 h₁₀₀⁻¹, but represents only a small fraction of the total
baryonic content of the Universe, as constrained by primordial
nucleosynthesis to be Ω_b ~ 0.01 h₁₀₀⁻². We can thus conclude that most of
the baryonic material, which resides in the Lyman α clouds with low column
densities of neutral Hydrogen, has been re-ionized at some epoch after the
z = 1000 “recombination”. This re-ionization was presumably caused by
ionizing radiation emitted by an early generation of stars or active galactic
nuclei (black hole accretion). This likely occurred in the redshift interval
just beyond our current measurements, 7 < z < 20 [150-152], producing
expanding regions of ionization which eat away at surrounding pockets of
neutral material.

Once the Universe is mostly ionized, ionization can be maintained by
the diffuse ultraviolet radiation field from the population of young galaxies
and quasars. The metagalactic radiation field can be indirectly estimated
from the proximity effect, which is a decline in the number of absorbing
systems close to a source of ionizing radiation (e.g. the quasar itself). At
z = 2.5 this is estimated to be approximately … W Hz⁻¹ m⁻² sr⁻¹ at
the Lyman limit of 912 Å. The known quasar population fails to account
for this by a modest factor [153] and the shortfall is likely accounted for by
hot stars in young galaxies. Direct measurements of the ionizing flux may
soon be possible [154].

Interestingly, the epoch of reionization of Helium II (i.e. of He⁺ to
He⁺⁺) appears to have been observed at z ~ 2.9 [155-157]. This suggests
that whatever produced the initial reionization of Hydrogen at much higher
redshifts had a softer spectrum than that of quasars since these would have
been able to fully ionize both Hydrogen and Helium.

There is now good evidence that the general intergalactic medium as
probed by the Lyα forest has been enriched with heavy elements to a
significant degree, with metallicities Z ~ 0.01 Z⊙ [159]. It is natural to
assume that this enrichment was produced by the first generation of small
galaxies or star-clusters that formed and which were also responsible for the
reionization of the intergalactic medium. Indeed the amount of star-formation
required to account for both phenomena is comparable, since the energy
emitted in ionizing photons per baryon is roughly 0.002 Z m_p c² (see [160]),
or about 10 ionizing photons per baryon for Z ~ 10⁻⁴.
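
The estimate of about 10 ionizing photons per baryon can be checked with a short calculation. The sketch below reads the expression in the text as an energy of 0.002 Z m_p c² per baryon and takes Z ~ 10⁻⁴ as a mass fraction; the mean photon energy of 20 eV and the function name are my assumptions, not values given in the text.

```python
M_P_C2_EV = 938.272e6  # proton rest energy in eV


def ionizing_photons_per_baryon(z_metal, photon_energy_ev=20.0, efficiency=0.002):
    """Ionizing photons per baryon for a metal mass fraction z_metal.

    efficiency * z_metal * m_p c^2 is the energy radiated in ionizing photons per
    baryon; photon_energy_ev is an assumed mean energy per ionizing photon.
    """
    energy_per_baryon_ev = efficiency * z_metal * M_P_C2_EV
    return energy_per_baryon_ev / photon_energy_ev


n_photons = ionizing_photons_per_baryon(1.0e-4)
print(f"Z = 1e-4 -> about {n_photons:.0f} ionizing photons per baryon")
```

With these assumptions the energy budget is close to 190 eV per baryon, or roughly 10 photons, so enrichment to this level and re-ionization require comparable amounts of star-formation.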



10.2 High column density systems 

When the column density is high (N_HI > 2 × 10²⁰ cm⁻², a column density
comparable to a line of sight perpendicular to the inner disk of a
present-day spiral galaxy), the absorption is heavily saturated. This
material is almost certainly associated with objects that are or will become
substantial galaxies. However, the metallicities of the gas are far below the
roughly solar values found in such systems today. The metallicities are low
(Z ~ 0.02 Z⊙) [161], but by no means primordial.

Two models have been proposed for this high column density material.
The material could be randomly moving within dark matter haloes, or it
could be in ordered rotation in a young galactic disk. These two pictures
could well represent different steps along a single evolutionary path
depending on the time required for gas to settle down into disk-like
structures. Distinguishing between them is a difficult proposition [162,163].
It may be that large ordered proto-disks are present at z ~ 3, but as noted
above, I am quite impressed by their absence at these redshifts in stellar
emission (see Sect. 6.2.2).

10.3 The Lyman α forest systems

As noted above, the modern view of the lower column density systems is
that they represent a non-linear mapping of the general density field and
one of the great successes of numerical hydrodynamic models of dark and
baryonic matter in the high redshift Universe has been the reproduction of
the statistical properties of the Lyman α forest absorption [164,165,167,168].
The Lyα forest is produced by regions with modest overdensities of below
a factor of 10 in which the gas and dark matter trace each other quite well.
There is some self-shielding from the ionizing radiation but the neutral 
fraction is still very small, around 10“^. 

10.4 Global evolution of the neutral Hydrogen content 

An interesting quantity that it is straightforward to estimate is the average 
density of neutral Hydrogen in the Universe. This may be obtained by 
simply integrating the distribution of column densities, which is dominated 
by the higher column density systems (i.e. the damped systems of Sect. 10.2).
The density of neutral gas appears to peak at z ~ 3 and to decline at higher,
and especially at lower, redshift (Fig. 9). Neutral gas is cool enough that
stars can form, and it is not surprising that there is more such material 
at high redshifts than at low redshifts. Interestingly, and as noted above, 
the neutral Hydrogen content of the Universe peaks at z ~ 3 at a mean 
comoving density that is comparable to the mean comoving density of stars 
in the present-day Universe. It is therefore natural to ascribe the decline in 
neutral Hydrogen to its consumption into the present-day stellar population 
of the universe. Pursuing this idea, it is noteworthy that Pei and Fall 
[168] actually predicted the steep rise in ultraviolet luminosity density with 
redshift a short time before this was observed [133]. However, the fact 
that the HI density does not decline monotonically indicates that neutral 
Hydrogen is produced as well as consumed as the Universe evolves. 



11 The first stars 

The significant metallicities, and luminosities, of the emission and
absorption systems known at z ~ 6 and the evident ionization of the
intergalactic medium at this redshift make it clear that we have not detected
the first generation of luminous objects. The properties of the first objects
are highly uncertain. The first stars will form when baryons are able to
dissipatively cool within dark matter haloes [150,151,169]. The role of
cooling due to molecular Hydrogen is unfortunately uncertain. Prior to
re-ionization, atomic Hydrogen will cool in halos with virial temperatures
greater than 10⁴ K whilst after re-ionization, this critical temperature is
raised to lO"*’® K. Figure 10 (from [169]) shows the virial temperatures of
dark matter haloes as a function of epoch and the amplitude of the density
fluctuations (parameterized by the sigma of the fluctuation) in a standard
CDM cosmogony. Alternatively, if primordial black holes exist, then
accretion onto these could conceivably represent the first sources of light in
the Universe.

Fig. 9. Estimates of the comoving density of neutral Hydrogen in the Universe
as a function of redshift, expressed in terms of the density parameter, Ω, from
[149]. The elevated points attempt to take into account possible biases arising
from reddening and the contribution from lower column density systems. The
hatched area represents the density of stars in the present-day Universe. The
small point at z ~ 1 represents an estimate based on 21-cm emission.

Fig. 10. The virial temperature/velocity dispersion of dark-matter haloes as a
function of redshift, from [169]. The three curves represent 1, 2 and 3σ peaks in a
CDM model. The dashed lines are lines of constant mass, while the dotted curves
give a crude estimate of the expected brightness of the objects. These should be
detectable with the NGST.



If the deepest images taken with NGST penetrate to AB ~ 33 over
the 1-5 μm waveband then they should be able to detect very low levels
of star-formation to very high redshifts. NGST should be able to detect
star-formation rates of only a few M⊙ yr⁻¹ (maintained for 10⁸ years in
the absence of strong dust extinction) to redshifts of 20-40, depending
on the cosmology. Penetrating to AB = 33 in the continuum gets to the
approximate level expected for these objects and even fainter continuum
sources could possibly be identified through strong Lyα emission.
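
To give a feel for how faint AB = 33 is, the sketch below converts AB magnitudes into flux densities using the standard AB zero point of 3631 Jy. The function name is hypothetical; no instrument sensitivity is implied.

```python
AB_ZERO_POINT_JY = 3631.0  # flux density of an AB = 0 source, in Jy


def ab_to_njy(m_ab: float) -> float:
    """Convert an AB magnitude to a flux density in nanojansky."""
    return AB_ZERO_POINT_JY * 10 ** (-0.4 * m_ab) * 1.0e9


for m_ab in (25.0, 30.0, 33.0):
    print(f"AB = {m_ab:.0f} -> {ab_to_njy(m_ab):.3g} nJy")
```

AB = 33 corresponds to about 0.23 nJy, some 1600 times fainter than an AB = 25 source.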

12 Summary

In terms of the formation and early evolution of galaxies at high redshifts: 

1. The Hubble sequence of galaxy morphologies is in place by z ~ 1 and
the number densities of large ellipticals and spirals have not changed
dramatically since that epoch. On the other hand, the Hubble sequence
is probably not in place at z ~ 3, and the number densities of
large ellipticals and spirals are probably much lower at this epoch than
at present. I base this statement on Dickinson’s beautiful NICMOS
images of the Hubble Deep Field;

2. Small, irregular, vigorously star-forming galaxies of moderate mass
are much more prevalent at z ~ 1 (where they dominate) than they
are now. The present-day descendants of these galaxies are unclear, as
is their relationship to the Lyman-break population at z > 3, to which
they nonetheless bear considerable similarities;

3. There is morphological evidence that merging of galaxies was a more
important (and possibly dominant) phenomenon in assembling galaxies
at z > 1 than in the present-day Universe;

4. Likely related to (3), ultra-luminous dust-enshrouded galaxies at
z > 1 contribute a much larger fraction (about 30%) of the bolometric
energy output of the Universe than at the present epoch (about 0.3%).
These high redshift systems could be producing a substantial fraction
of the present-day stellar population of the Universe, and could most
naturally be associated with the production of the classical spheroids.
There is some evidence that this may peak at modest redshifts
2 < z < 3;

5. The neutral Hydrogen content of the Universe at z ~ 4 is equivalent 
to the stellar mass seen in galactic disks today. This material has been 
enriched to about 0.01 times the solar value; 

6. The intergalactic medium at z ~ 5 is largely ionized and has already 
been enriched to about 0.01 times the solar value. 

If we accept the above (and not all will be happy with all of them), we
can sketch out the following possible story to link together these disparate 
strands of observational evidence. It should be emphasized that this is 
largely speculative at this stage. 

(a) The first, as yet unobserved, stars form in small galaxies at high
redshift, z > 5. These re-ionize the intergalactic medium and pollute it
to a uniform metallicity of about 0.01 solar. Some of these stars are
seen today in the low metallicity haloes of galaxies;

(b) As time passes, merging increases the dark mass of galaxies and
becomes increasingly important in the triggering of star formation.
Star-formation in relatively small galaxies or steady-state star-formation
at modest levels in larger galaxies would be seen as the relatively
unobscured ultraviolet-selected Lyman-break population at roughly
constant number density;

(c) However, as the masses and metallicities of galaxies build up, mergers
would become more violent (with gas being compressed deep within 
the gravitational potential wells) and the obscured ULIRGs would 
appear, dominating the star-formation in the Universe at 1 < z < 3 
and plausibly resulting in the spheroids. The quasars, probably arising 
under similar conditions from black-hole accretion, would follow a 
similar evolutionary history; 

(d) By z ~ 1 the worst of the merging would be past. Disks could survive 
and the familiar population of spirals and ellipticals would stabilize. 
The ULIRG and quasar populations would decline dramatically; 

(e) Heading towards the present-epoch, star-formation in disks and in the 
irregular population would decline, presumably as the availability of 
gaseous fuel declined. Star-formation in smaller galaxies would still 
be modulated by the feedback between gas and stars, but would be 
steadier in larger disks. 

In this picture, the different features of the star-formation history diagram 
are accounted for in terms of the merger history, and in particular the differ- 
ent behaviours at z < 1 and z > 1 in the luminosity density diagram could 
reflect phases when merging is and isn’t the dominant process. One could 
speculate why merging might switch off at this point. It could conceivably 
be linked to the transition from a matter-dominated to a Λ-dominated
Universe in Λ-dominated cosmologies with Ω₀ = 0.2.
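
The redshift of this transition can be estimated with a short calculation. The sketch below assumes a spatially flat model, so that the vacuum density parameter is 1 - Ω₀ = 0.8; flatness is my assumption and is not stated in the text, and the function names are hypothetical.

```python
def z_equal_density(omega_m: float, omega_lambda: float) -> float:
    """Redshift at which the matter and vacuum energy densities are equal."""
    # Matter density scales as (1 + z)**3 while the vacuum density is constant.
    return (omega_lambda / omega_m) ** (1.0 / 3.0) - 1.0


def z_acceleration(omega_m: float, omega_lambda: float) -> float:
    """Redshift at which the expansion changes from decelerating to accelerating."""
    # Deceleration vanishes when omega_m * (1 + z)**3 = 2 * omega_lambda.
    return (2.0 * omega_lambda / omega_m) ** (1.0 / 3.0) - 1.0


omega_m, omega_lambda = 0.2, 0.8  # flatness assumed
print(f"Equal densities at z = {z_equal_density(omega_m, omega_lambda):.2f}")
print(f"Acceleration begins at z = {z_acceleration(omega_m, omega_lambda):.2f}")
```

For these values the densities are equal at z of about 0.6 and the expansion begins to accelerate at z = 1, which is at least suggestive given the change in behaviour seen near z ~ 1.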

I would argue that the above picture is broadly consistent with hierar- 
chical models of galaxy formation. It is often remarked that the evolution 
back to z ~ 1, being dominated by smaller galaxies, is precisely the opposite 
of that predicted from hierarchical models in which we would expect small 
galaxies to form first and be assembled into larger ones. I personally don’t
find this a concern. First, it should be remembered that the
optically-selected population may represent considerably less than a half of
the star-formation that has occurred in the Universe and that the bulk of the
star-formation may be going on in the obscured high luminosity ULIRGs. Thus
the evolutionary behaviour at z < 1 could be just an after-thought, occurring
after the bulk of galaxy formation was completed. This makes sense especially
if the hierarchical assembly of galaxies ceased at z ~ 1, perhaps due to a
low-density Λ-dominated cosmology.

Furthermore, the evolution of smaller galaxies is likely to be
astrophysically quite complex, with feedback from star-formation playing a
very important role. Thus the conversion of gas into stars within low mass
haloes may well be quite episodic and not follow simple ideas of the formation
of the haloes (for instance, there is as yet little evidence that a
significant fraction of the blue galaxies at z < 1 are in any sense
primordial). If the differential behaviour between the optically selected and
ULIRG populations which I suspect is occurring at z ~ 2 holds up, then in fact
this would be precisely the behaviour expected in hierarchical models.

This picture leads to some clear predictions that should be testable
within the next few years: (a) the bulk of the present-day spheroid
population should appear between 1 < z < 3 even if a few examples of quiescent
galaxies are found at higher redshifts; (b) the galaxies at z > 3 should have
masses significantly below those of present-day L* galaxies even if they are
the precursors of such galaxies; (c) the ultra-luminous galaxies and quasars
should indeed decline in number density at the highest redshifts.

Abstract

There are ten critical cosmological parameters that monitor cosmo- 
logical models and structure formation. In this review, I summarize 
the current status of these parameters and discuss the issues that are 
presently unresolved. The successes and failures of galaxy formation 
theory are reviewed. The cold dark matter-based models have had 
remarkable success in accounting for various aspects of large-scale 
structure. However on galaxy scales, cold dark matter appears to 
be in crisis. I discuss prospects for improvement in our theoretical 
modelling. 

1 Introduction 

The great achievement of Friedmann and Lemaître was to realise that
Einstein’s equations simplify under the assumptions of homogeneity and
isotropy to an equation for the scale factor $a(t)$ that is the ratio of physical
to coordinate (or comoving) distance. One can characterise the solutions
for $a(t)$ by four parameters, $H_0$, $\Omega_0$, $\Omega_\Lambda$ and $T_0$. There are other parameters
that are independently derived but depend on the primary parameters,
thereby enabling one to develop a self-consistent model. These are the age
$t_0$, deceleration $q_0$ and curvature $k$. In addition there is the contribution
to $\Omega_0$ from baryons, $\Omega_b$. $\Omega_0$ refers to matter that is cold (that is, gravitationally
clustered). I add two additional parameters that characterise the
fluctuations, $n$ and $\sigma_8$, to give 10 parameters. The cosmological model is
overspecified because of the two Friedmann-Lemaître equations for $\dot a$ and $\ddot a$,
which specify $q_0$ and $k$ as functions of $H_0$, $\Omega_0$ and $\Omega_\Lambda$.
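
As a rough illustration of how the derived parameters follow from the primary ones, the sketch below evaluates the deceleration parameter and the curvature for a universe containing only pressureless matter and a cosmological constant. It assumes the standard relations $q_0 = \Omega_0/2 - \Omega_\Lambda$ and $k = (\Omega_0 + \Omega_\Lambda - 1)H_0^2$ with $a(t_0) = 1$; the function name is a hypothetical choice made for this example.

```python
def derived_parameters(h0_km_s_mpc, omega_m, omega_lambda):
    """Return (q0, k) for a matter plus cosmological-constant universe.

    Assumed relations (standard Friedmann-Lemaitre results, a(t0) = 1):
      q0 = omega_m / 2 - omega_lambda          (dimensionless)
      k  = (omega_m + omega_lambda - 1) H0^2   (in units of (km/s/Mpc)^2)
    """
    q0 = 0.5 * omega_m - omega_lambda
    k = (omega_m + omega_lambda - 1.0) * h0_km_s_mpc ** 2
    return q0, k


# Illustrative inputs: H0 = 65 km/s/Mpc with three choices of densities.
for label, om, ol in [("Einstein-de Sitter", 1.0, 0.0),
                      ("open, low density", 0.3, 0.0),
                      ("flat with Lambda", 0.3, 0.7)]:
    q0, k = derived_parameters(65.0, om, ol)
    print(f"{label:20s} q0 = {q0:+.2f}  k = {k:+.1f}")
```

A negative $q_0$ signals accelerated expansion, and the sign of $k$ distinguishes open from closed models; only two of the three quantities $\Omega_0$, $\Omega_\Lambda$ and $k$ are independent.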

Armed with these parameters, one can attack the problem of struc- 
ture formation. The underlying density fluctuations may be inadequately 
described by a primordial power-law power spectrum. Nor is the nature 
of the ubiquitous and dominant dark matter known with any degree of 
certainty. Nevertheless gravitational instability theory provides a remarkably
successful infrastructure for understanding the observed structure of
the universe on large scales. 

© EDP Sciences, Springer-Verlag 2000
On smaller scales, those corresponding to galaxies, our understanding
is in a more primitive state. Galaxy formation theory must account for 
the fossilized properties of galaxies as well as for their dynamical evolution. 
The current time-scales for significant changes in galaxy morphology, dy- 
namics or chemical abundance are extremely long compared to the age of 
the universe. Hence these properties of galaxies are fossilized relics of the 
formation process. Similarly the remarkable correlations, enshrined as the 
Tully-Fisher relation for spirals and the fundamental plane for ellipticals, 
bear witness to the imprint of the formation epoch. The modest, essentially 
passive, evolution in these correlations to z ~ 1 suggests that galaxies were 
already mature systems by then. 

Not only must theory account for the present state of galaxies, but it 
must be capable of predicting the past. The high redshift barrier to detect- 
ing galaxies is being breached: there are now at least three galaxies with 
redshifts above 5. Deep Hubble Space Telescope images have unveiled the
morphology of distant galaxies, and in the universe at high redshift we are 
able to see younger counterparts of the nearby galaxies. Galaxy formation 
theory must be capable of reconstructing the star formation history of the 
nearby galaxies. It must be capable of spanning the dark ages that commenced
with the cosmic microwave background fluctuation imprint, included
the epoch of density fluctuation growth, and terminated with the first light
from primordial star formation. The theory must be able to account for
such diverse systems as star-forming grand design spiral galaxies, elliptical 
galaxies that have long ceased to form stars, and irregulars that have been 
dubbed the astronomical equivalent of shipwrecks, as well as the energetic 
phenomena that occur in active galactic nuclei and quasars. The theory 
must not only produce these objects as observed near us, but predict the 
properties of high redshift counterparts and incorporate the effects of the 
environment, from rich clusters to the field and even the great voids. 

Theory is confronting the challenge of predicting the properties of the 
distant universe. How successfully this is being accomplished is one of the 
themes of this article. One can imagine two conceptually distinct approaches 
to galaxy formation. One ideally would commence with inflationary fluc- 
tuations and follow their evolution until galaxies formed and evolved into 
mature systems. A complementary approach is to model the star formation 
rate in nearby galaxies and run this backwards in time, incorporating the 
additional astrophysics required to account, for example, for such phenomena
as gas infall, which occurs erratically at present but is inferred to have
played a greater role in the past. Of course, as I shall argue, these approaches
must merge together in the ultimate theory of galaxy formation.

The following sections review the status of our knowledge of the various 
cosmological parameters. My quoted errors are often subjective, intended 
to show consensus rather than the precision of individual determinations. 
I then present the highlights of the ab initio approach to galaxy formation.
I follow this with a discussion of backwards formation, and conclude with a
perspective on where we might hope to be in the next decade.

2 Temperature 

The microwave background is the most precisely measured of any cosmological
component. It is a blackbody at a temperature $T_0 = 2.728$ K which is
measured to a precision that is unequalled in nature for a blackbody source,
with no distortions detected to an uncertainty of less than 50 μK [1]. Useful
limits on structure formation come from upper limits on spectral distortions.
The Compton $y$ parameter is measured on 7 degree angular scales to have
an upper limit (95% confidence)

$$y = \int n_e \sigma_T \left(\frac{kT}{m_e c^2}\right) c\,dt < 3 \times 10^{-6}. \qquad (2.1)$$

This limit strongly constrains late (matter era) energy input, and early 
(radiation era) energy input is constrained by the limit on the chemical 
potential 

/r<6xl0-% (2.2) 



3 Age 

One of the historically controversial parameters has been the age of the 
universe. This is best constrained, independently of cosmological model, 
from the age of the oldest stars. Modern stellar evolution tracks and dating
of globular cluster stars yield a stellar age of 12 ± 2 Gyr [2]. Adding on
2 Gyr for the period prior to formation of the Milky Way, the age of the
universe is estimated to be

$$t_0 = 14 \pm 2\ \mathrm{Gyr}. \qquad (3.1)$$



4 Hubble’s constant 

Even more controversial historically has been the determination of 
Hubble’s constant. Long the focus of efforts by distinct groups that of- 
fered determinations differing by five or more standard deviations of esti- 
mated error, the Hubble constant controversy has practically disappeared. 
In large part, this is due to improved calibration with HST of Cepheid 
variables and supernova light curves in nearby galaxies. A value for $H_0$
of 65 ± 10 km s$^{-1}$ Mpc$^{-1}$ generously encompasses recent determinations
centering on 60 km s$^{-1}$ Mpc$^{-1}$ [3] and 72 ± 10 km s$^{-1}$ Mpc$^{-1}$ [4]. Combination
of $H_0$ and $t_0$ means that even an Einstein-de Sitter universe, with
$t_0 = (2/3)H_0^{-1}$, is close to being consistent with the observed age of the
universe. With the preferred $\Omega_0 \approx 0.3$ universe that is currently measured,
an age as long as 17 Gyr is allowed.
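
The age comparison can be checked numerically. The following is a minimal sketch, assuming a universe with only pressureless matter, curvature and a cosmological constant (radiation neglected), for which $t_0 = H_0^{-1}\int_0^1 da/[a\,E(a)]$ with $E^2 = \Omega_0 a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda$. The function name, the midpoint-rule integration and the conversion factor 977.8 Gyr for $1/(1\ \mathrm{km\ s^{-1}\ Mpc^{-1}})$ are choices made for this example.

```python
import math

# 1 / (1 km/s/Mpc) expressed in Gyr (approximate conversion factor).
HUBBLE_TIME_GYR_PER_UNIT = 977.8


def age_gyr(h0_km_s_mpc, omega_m, omega_lambda, steps=20000):
    """Age of a matter + curvature + Lambda universe, by the midpoint rule.

    Integrates da / (a E(a)) from a = 0 to a = 1, where
    a E(a) = sqrt(omega_m / a + omega_k + omega_lambda * a**2).
    """
    omega_k = 1.0 - omega_m - omega_lambda
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        total += da / math.sqrt(omega_m / a + omega_k + omega_lambda * a * a)
    return total * HUBBLE_TIME_GYR_PER_UNIT / h0_km_s_mpc


# Einstein-de Sitter, open and flat-Lambda models over the quoted H0 range.
for h0 in (55.0, 65.0, 75.0):
    print(h0,
          round(age_gyr(h0, 1.0, 0.0), 1),
          round(age_gyr(h0, 0.3, 0.0), 1),
          round(age_gyr(h0, 0.3, 0.7), 1))
```

For the Einstein-de Sitter case the integral reduces to 2/3, giving about 10 Gyr at $H_0 = 65$ km s$^{-1}$ Mpc$^{-1}$, while a flat $\Omega_0 = 0.3$ model at the low end of the quoted $H_0$ range reaches roughly 17 Gyr.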

5 Baryon density parameter 

It is customary to define the mean present density of the universe relative
to the Einstein-de Sitter density, $3H_0^2/8\pi G$. The baryon density $\Omega_b$ is
determined in two ways. The most precise determination comes from comparison
of primordial nucleosynthesis predictions with the observations of
as primitive $^2$H, $^4$He and $^7$Li as one can find. High redshift determinations
of $^2$H, while limited to only two lines of sight, have converged, as has the
extrapolation of $^4$He in low metallicity galaxies to zero metallicity [5]. The
net result is $\Omega_b h^2 = 0.015(\pm 0.005)$, so that $\Omega_b = 0.04(\pm 0.01)$. Modelling
of the Lyman alpha forest at $z \approx 3$ has established that $\Omega_b(z \approx 3) \approx 0.03$.
Most of the gas is highly photoionized by the radiation field from quasars.
The robustness of the modelling, via simulation and analytic techniques,
depends on the fact that $\Omega_b$ depends only on the square root of the ionizing
photon flux, the neutral component being directly measured [6].

At somewhat lower redshift, in the range $z \approx 2$ to 3, the damped Lyman
alpha clouds, considered to be the progenitors of galaxies if not of galactic
disks, account for $\Omega_b \approx 0.003$. This coincides approximately with the
baryonic content in stars at $z \approx 0$, where the luminous regions of galaxies
account for a similar amount. In the field, the correction for ionized gas
at $z \sim 0$ is uncertain. In halos and in clusters of galaxies, the contribution
of atomic and ionized diffuse gas to $\Omega_b$ is small or negligible. Attempts
to account for halo dark matter as cold molecular gas remain controversial.
Halos, with $M/L \approx 50$ relative to the universal value $M/L = 1000\,\Omega$ for the
measured mean luminosity density, account for $\Omega_{\rm halo} \approx 0.05$ in dark matter.
Halos could mostly consist of dark baryons, although there are probably not
enough baryons to account for all of the halo dark matter.
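
The bookkeeping in the last two paragraphs is simple arithmetic, sketched below. The example assumes that the nucleosynthesis value 0.015 refers to $\Omega_b h^2$ with $h = H_0/(100\ \mathrm{km\ s^{-1}\ Mpc^{-1}}) = 0.65$, and that a mass-to-light ratio converts to a density parameter through $\Omega = (M/L)/1000$; the function names are hypothetical.

```python
def omega_b_from_nucleosynthesis(omega_b_h2, h):
    """Convert the nucleosynthesis constraint on Omega_b h^2 into Omega_b."""
    return omega_b_h2 / h ** 2


def omega_from_mass_to_light(m_over_l, critical_m_over_l=1000.0):
    """Density parameter implied by a mass-to-light ratio (solar units),
    assuming the universal value M/L = 1000 * Omega quoted in the text."""
    return m_over_l / critical_m_over_l


h = 0.65                                            # assumed from H0 = 65 km/s/Mpc
omega_b = omega_b_from_nucleosynthesis(0.015, h)    # total baryons
omega_halo = omega_from_mass_to_light(50.0)         # dark halos
omega_luminous = 0.003                              # stars, or damped Lyman alpha gas

print(f"Omega_b    = {omega_b:.3f}")
print(f"Omega_halo = {omega_halo:.3f}")
print(f"luminous fraction of baryons = {omega_luminous / omega_b:.2f}")
```

Under these assumptions the luminous component is less than a tenth of the nucleosynthesis baryon budget, and the halo density is somewhat larger than the baryon density itself.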

Most baryons are evidently dark, and have become dark between $z \approx 3$
and $z \approx 0$. This may be partly an effect of observational selection: it
is exceedingly difficult to map the local counterparts of the high redshift 
Lyman alpha forest. There could be a substantial mass of (~ 10® K) gas, 
outside clusters and groups, in the filaments and sheets that characterize 
large-scale structure. Such gas has not yet been detected but its presence 
is predicted by high resolution simulations of large-scale structure [7] as 
well as suggested by OVI absorption towards high redshift quasars [8] and 
indications of a gas component with $b$-values as high as $\sim 80$ km s$^{-1}$ [9].

Another likely sink for at least some dark baryons is in the form of 
MACHOs. These compact baryonic objects are likely to be either white 
dwarfs, brown dwarfs or primordial black holes. Primordial black holes are 
possible MACHOs that can account for gravitational microlensing signals,
but would not contribute to the baryon budget at primordial nucleosynthesis.
However the microlensing events have also been attributed to self-lensing
by either SMC or LMC stars, most plausibly
for the 2 events detected towards the SMC [10]. MACHOs may be brown
dwarfs for a suitably truncated (and ad hoc) dynamical model of the halo. 
MACHOs are most plausibly old white dwarfs in galaxy halos [11]. There 
is possible evidence for old white dwarfs as MACHO candidates, based on 
proper motions of faint HDF stars [12]. Spectroscopic confirmation supports 
this identification [13]. Such objects could have masses consistent with the 
measured timescales of the LMC microlensing events if these are interpreted 
as being due to halo objects, and be sufficiently common to account for some 
10 percent of the halo dark matter density. 

6 Matter density parameter 

Consider next the total density of non-relativistic matter in the universe,
denoted at the present epoch by $\Omega_0$. There are several observational indications
which support a value $\Omega_0 = 0.3(\pm 0.1)$. In order of robustness, these
are the relative change in frequency of rich clusters between $z = 0$ and
$z = 1$, large-scale bulk flows, the gas fraction in clusters, and the present
number density of rich clusters. The growth of density fluctuations is arrested
at $z \lesssim \Omega_0^{-1}$ and indeed systematically becomes suppressed by a factor
$\Omega_0^{0.6}$ relative to the growth in a $k = 0$ universe. Galaxy clusters are rare
objects, forming in the exponential tail of a gaussian distribution of density
fluctuations, and objects moreover whose formation is well understood in
terms of gravitational instability and collapse. It is therefore a straightforward
prediction of a high $\Omega_0$ universe that there is a strong decrease in the
number of clusters above a specified mass with increasing redshift. A low
$\Omega_0$ universe, in contrast, has little evolution in cluster frequency at $z \lesssim 1$.
The discovery of at least one rich cluster at $z \sim 0.8$, for which X-ray temperatures
are measured, and the probable existence of massive clusters at
$z > 1$, provide strong arguments for a model with $\Omega_0 \approx 0.3$ [14], although
not all authors agree with this conclusion [15].

Large-scale bulk flows are predicted, for cold dark matter models, to
amount to

$$\sim 1000\,\sigma_8\,\Omega_0^{0.6}\ \mathrm{km\ s^{-1}}, \qquad (6.1)$$

where the inverse bias parameter $\sigma_8$ is the rms density fluctuation amplitude
at $8h^{-1}$ Mpc, the scale at which galaxy number count fluctuations have unit
amplitude. Measured bulk flows are $\sim 300$ km s$^{-1}$ on scales of $l \sim 40$ Mpc
[16]. This favours $\Omega_0 \approx 0.3$, but the uncertainties are large. It is most
directly characterized from large-scale flows and cluster abundances which
require $\sigma_8\Omega_0^{0.6} \approx 0.7(\pm 0.2)$. Detailed comparisons of the peculiar velocity
and mass density fields are less conclusive. There is a trade-off between
raising bias or lowering $\Omega$, and the empirical evidence cannot do much better
than argue for $0.5 \lesssim \Omega_0^{0.6}\sigma_8 \lesssim 1$. Weak lensing measurements suggest that
$\sigma_8 \approx 1$, as does the reconstruction of the density fluctuation power spectrum
in the region of overlap between CMB and galaxy redshift survey probes [17].

The gas fraction in rich clusters is about 0.15, and adding the stellar
component gives a cluster baryon fraction of 0.18. For this to be reconciled
with the nucleosynthesis baryon abundance of 0.03 requires $\Omega_0 \approx 0.15$,
provided that we assume that galaxy clusters fairly sample the universal
baryon fraction [18]. The predicted rich cluster number density is higher than
observed by a factor 10 to 100 if $\Omega = 1$ at the present epoch.
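
The baryon fraction argument amounts to a single division, shown here as a small sketch. It assumes that clusters are a fair sample, so that $\Omega_0 = \Omega_b / f_b$ with $f_b$ the cluster baryon fraction; the function name is hypothetical, and the second value of $\Omega_b$ is the nucleosynthesis central value quoted earlier.

```python
def omega_matter_from_baryon_fraction(omega_b, cluster_baryon_fraction):
    """Matter density implied if clusters contain the universal baryon fraction."""
    return omega_b / cluster_baryon_fraction


gas_fraction = 0.15
baryon_fraction = 0.18          # gas plus stars, as quoted in the text

for omega_b in (0.03, 0.04):
    omega_0 = omega_matter_from_baryon_fraction(omega_b, baryon_fraction)
    print(f"Omega_b = {omega_b:.2f}  ->  Omega_0 = {omega_0:.2f}")
```

Either input gives a matter density well below unity, of the same order as the value quoted in the text; the spread between the two results indicates how sensitive the estimate is to the adopted baryon density.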

However comparison with observations of optical clusters requires one 
to assume a universal galaxy formation efficiency in clusters. Moreover 
X-ray selected cluster samples are gas-biased. One has to assume that the 
intergalactic gas has not undergone excessive preheating, in which case gas 
infall to rich clusters, and especially into cluster cores, would be suppressed. 
Neither of these assumptions can be rigorously justified, and only a grav- 
itational shear-selected cluster sample via ongoing and pending wide field 
weak lensing surveys will be able to definitively address this test. 

7 Cosmological constant 

First introduced and then rejected by Einstein, the cosmological constant 
has subsequently been in and out of fashion. More recently, it has been 
reinterpreted as a vacuum density, for which very early universe spontaneous 
symmetry breaking phase transitions have provided evidence of existence, 
although at an energy density that is of order $(100\ \mathrm{GeV})^4$ as compared
to the observed or constrained current epoch value of around $(0.001\ \mathrm{eV})^4$.
The corresponding equation of state is $p = -\rho$, and from the Friedmann-Lemaître
equation for the scale factor

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3p), \qquad (7.1)$$

one infers that acceleration of the universe is inevitable even for less extreme
values of negative pressure, provided that $p = w\rho$ with $w < -1/3$. There is no
means at present of distinguishing between alternative formulations of the
generalized equation of state for vacuum energy, including the quintessence
models in which $w$ is a function of time.
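
Two steps in this paragraph can be written out explicitly. The lines below are a short worked sketch, assuming units with $c = 1$, a single dominant component with $\rho > 0$, and the equation of state $p = w\rho$; the second line simply evaluates the ratio of the two energy densities quoted above.

```latex
\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3p)
                  = -\frac{4\pi G}{3}\,\rho\,(1 + 3w),
\qquad
\ddot a > 0 \iff 1 + 3w < 0 \iff w < -\tfrac{1}{3}.

\frac{(100\ \mathrm{GeV})^4}{(0.001\ \mathrm{eV})^4}
  = \left(\frac{10^{11}\ \mathrm{eV}}{10^{-3}\ \mathrm{eV}}\right)^{4}
  = \left(10^{14}\right)^{4} = 10^{56}.
```

The first line shows why $p = -\rho$ (that is, $w = -1$) is only the extreme case of an accelerating equation of state; the second quantifies the mismatch between the electroweak-scale vacuum density and the value allowed at the current epoch.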

However observations of supernovae of type Ia have provided strong support
for a cosmological constant, provided that one accepts the case that
SNIa are good standard candles. Acceleration is inferred and provides a
measure of $\Omega_m - \Omega_\Lambda$, which is found to be negative and $\sim -0.2$ [19,20].
Here $\Omega_\Lambda$ is the normalised vacuum density, and the classical deceleration
parameter is $q_0 = \Omega_m/2 - \Omega_\Lambda$. The dimming of high redshift relative to low
redshift supernovae could be an artefact of intervening dust. However no
colour differences are found, and one would have to appeal to $\sim 0.2$ mag
of grey extinction, with no apparent change in the dispersion of supernova
peak magnitudes, between $z = 0$ and $z = 1$. A more serious concern is the
possibility of evolution of the supernova itself. This could easily happen because
the precise precursor history of the merging or accreting white dwarf
is unknown for SNIa, and could plausibly vary as the age of the parent
population decreases and its main sequence turn-off mass increases. It is difficult to
say if this would affect the supernova light curve: however there are as yet
no 3-dimensional calculations of white dwarf mergers, so that the theory is
incomplete. Moreover, only about half of the He/C/O core implodes and is
a source of energy for the light curve, so that evolution could conceivably
result in the 20 percent modulation needed to seriously change the interpretation
of the observed dimming. The most recent controversy centres on
the fact that a difference in precursor rise times has been claimed between
high and low redshift supernovae [21]. This is the first tentative evidence
for possible evolutionary differences as a function of cosmic epoch, and, if
this result were to hold up, would suggest that the systematic uncertainties
in the relative calibration of high and low redshift supernovae may be unknown.

8 Spatial curvature 

The Friedmann-Lemaître energy equation,

$$H^2 = \frac{8\pi G\rho}{3} - \frac{k}{a^2} + \frac{\Lambda}{3}, \qquad (8.1)$$

yields an equation for the curvature

$$k = (\Omega_\Lambda + \Omega_0 - 1)H_0^2. \qquad (8.2)$$

Curvature is directly measured by integrating along light rays and measuring
proper density or the associated angular separation of objects, which depend
in turn on the proper distance

$$r\,a(t) = a(t)\int \frac{c\,dt}{a(t)}. \qquad (8.3)$$

Specific curvature measurements include determination of the angular sizes
of objects of known proper size, gravitational lensing of distant galaxies
or quasars by foreground galaxy clusters or galaxies, and measurement of
the comoving density of well-specified classes of objects. However the most
promising and most straightforward curvature determination utilizes the
first acoustic peak of the temperature fluctuations in the cosmic microwave
background [22]. The maximum sound horizon is essentially cosmological
model-independent. Inflationary fluctuations originate as superhorizon
metric or curvature fluctuations, and consequently there is a peak wavelength
that reaches its first peak on the horizon scale at last scattering. All
scales that are of interest for structure formation are acoustic waves prior to
matter-radiation decoupling, and eventually dissipate by radiative diffusion
on subhorizon scales. However a series of coherent peaks develops at last
scattering, corresponding to the compressions and rarefactions of sound
waves between the horizon and damping scales at wavenumber $\pi n(v_s t_{ls})^{-1}$
for $n = 1, 2, \ldots$ The most pronounced peak, at $\pi(v_s t_{ls})^{-1}$, should be visible,
when projected into spherical harmonics on the celestial sphere, at $\ell \approx 220$
(or about 1°) for $\Omega_\Lambda \approx 0.7$ in a spatially flat universe. Negative curvature
shifts the first acoustic peak by a factor of order $\Omega_0^{-1/2}$ towards smaller angular
scales. Current experiments constrain the spatial curvature to be near
zero [23,24], with recent experiments constraining $180 \lesssim \ell \lesssim 250$ and a
liberal interpretation of the data allowing $2 \gtrsim \Omega_\Lambda + \Omega_0 \gtrsim 0.5$. However it
is anticipated that future experiments will greatly increase the sensitivity
of this curvature constraint. Of course this conclusion does assume an inflationary
origin for the fluctuations. If the fluctuations are, for example,
of the isocurvature mode as in string models, the most prominent peak can
occur not at last scattering on the maximum sound horizon $v_s t_{ls}$ but, in
at least some models, can be delayed because the induced velocity perturbations
are $\pi/2$ out of phase with the primary adiabatic perturbations at
the horizon scale. Hence the canonical peak location in a zero curvature
background can, in principle, be duplicated by a low density isocurvature
model with $\Omega_0 \approx 4/\pi^2$.
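
The numbers in this paragraph can be related through a rough scaling. The sketch below assumes that the first peak sits at $\ell \approx 220$ for a flat model and shifts as $\Omega_{\rm tot}^{-1/2}$; this is an order-of-magnitude rule, not a substitute for a full Boltzmann calculation, and the function names are hypothetical.

```python
import math


def first_peak_multipole(omega_total, flat_value=220.0):
    """Approximate first acoustic peak position, l ~ 220 * Omega_total**-0.5."""
    return flat_value / math.sqrt(omega_total)


def omega_total_from_peak(l_peak, flat_value=220.0):
    """Invert the same rough scaling to estimate Omega_total from a peak position."""
    return (flat_value / l_peak) ** 2


# Range of total density implied by the quoted range of peak positions.
for l_peak in (180.0, 220.0, 250.0):
    print(l_peak, round(omega_total_from_peak(l_peak), 2))

# Geometric shift factor for the low-density value quoted for the
# isocurvature case, Omega_0 = 4 / pi^2: the factor equals pi / 2.
print(round(first_peak_multipole(4.0 / math.pi ** 2) / 220.0, 3))
```

Under this scaling the quoted range of peak positions maps onto a total density within the liberal bounds stated above, and the value $4/\pi^2$ corresponds to a geometric shift of exactly $\pi/2$, matching the phase offset invoked for the isocurvature mode.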

Gravitational lensing has provided an alternative approach to measuring
curvature, and more specifically $\Omega_\Lambda$ to the extent that this determines $\Omega_0$.
Quasar and compact extragalactic radio source splittings argue [25] for a
modest $\Omega_\Lambda \lesssim 0.6$, in weak contradiction with the SNIa result of $\Omega_\Lambda \approx 0.7$.
The frequency of giant arcs actually favours low $\Omega_0$ and negligible $\Omega_\Lambda$ [26].
In effect, gravitational lensing measures comoving volume, which at large
redshift is dominated by the matter density: the combination $\Omega_\Lambda + (1+z)^3\Omega_0$
is being constrained. Ultimately, we might hope to measure the frequency
of galaxies at very high redshift. This would provide a strong constraint on 
the comoving volume available above a specified redshift. 

9 Density fluctuations 

Primordial density fluctuations are an essential ingredient in the standard 
cosmological model. Quantum fluctuations imprinted by inflationary cos- 
mology on to macroscopic scales provided a major advance in the pre- 
dictability of cosmological models. Inflation generically results in a nearly 
scale-invariant (in practice, $n \approx 0.96$ whereas $n = 1$ for strict scale invariance)
power spectrum of primordial adiabatic density fluctuations. Modulated
by the transition from radiation to matter domination, the relic
density spectrum is $\delta\rho/\rho \propto M^{-(n_{\rm eff}+3)/6}$, where $n_{\rm eff}$ is the effective index,
equal to $-1.4$ on galaxy cluster scales, $-0.96$ on much larger scales,
and $-2.94$ on dwarf galaxy scales. In the matter-dominated era, the fluctuations
grow via gravitational instability, except on the smallest scales where
pressure gradients suppress growth of the baryonic component. The primor- 
dial density field can be taken to be Gaussian in the standard model. When 
the first peaks in the Gaussian tail develop sufficiently large amplitude for 
self-gravity to be important, gravitational collapse occurs to form the first 
self-gravitating structures. These are dark halos of mean overdensity about 
200 relative to the background at the onset of the contraction, and within 
which baryons are able to dissipate and cool on dark halo mass scales above 
~ 1O^M0. 
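
The dependence of the fluctuation amplitude on mass scale can be made concrete with a few lines of code. This is a minimal sketch assuming the standard scaling $\delta\rho/\rho \propto M^{-(n_{\rm eff}+3)/6}$ for a power-law spectrum of effective index $n_{\rm eff}$; the function names are hypothetical, and the sample indices are the cluster-scale value quoted above together with the limiting cases $n = 1$ and $n_{\rm eff} = -3$.

```python
def mass_exponent(n_eff):
    """Exponent alpha in delta_rho/rho ~ M**alpha, with alpha = -(n_eff + 3) / 6."""
    return -(n_eff + 3.0) / 6.0


def relative_amplitude(mass_ratio, n_eff):
    """Fluctuation amplitude at mass M2 relative to M1, for mass_ratio = M2 / M1."""
    return mass_ratio ** mass_exponent(n_eff)


for label, n_eff in [("strict scale invariance", 1.0),
                     ("galaxy cluster scales", -1.4),
                     ("small-scale limit", -3.0)]:
    print(f"{label:24s} alpha = {mass_exponent(n_eff):+.3f}  "
          f"amplitude ratio for 10x the mass = {relative_amplitude(10.0, n_eff):.3f}")
```

For any $n_{\rm eff} > -3$ the amplitude falls with increasing mass, so smaller masses reach the collapse threshold first; as $n_{\rm eff}$ approaches $-3$ the dependence flattens and a wide range of scales collapses almost simultaneously.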

The generic prediction of many (but not all) inflationary models is that 
nearly scale-invariant curvature fluctuations are the source of large-scale 
power in the universe. The natural consequence of scale-invariant curvature 
fluctuations is a bottom-up or hierarchical theory of structure formation in 
which progressively larger scales become unstable and collapse with increasing
epoch. The hierarchical prediction arises because the metric perturbations
corresponding to mass fluctuations $\delta M$ on scale $L$ can be expressed
as

$$\delta k = \frac{G\,\delta M}{Lc^2} \sim \frac{\delta\rho}{\rho}\left(\frac{L}{ct}\right)^2, \qquad (9.1)$$



and correspond to the amplitude of the density fluctuation at horizon cross- 
ing. Subhorizon growth beyond the initial value at horizon crossing only 
occurs during the matter-dominated phase of the expansion, thereby gener- 
ating a change in slope in the predicted density fluctuation spectrum from 
$\delta\rho/\rho \propto L^{-2}$ on large scales to $\delta\rho/\rho \propto$ constant on scales much less than the
horizon size at matter-radiation equality, about Mpc. Hence a 

bottom-up sequence of dark structure formation occurs as more and more
massive peaks develop.
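
The large-scale slope follows from the metric perturbation in a few steps. The lines below are a sketch that drops numerical factors of order unity and assumes the matter-era relation $G\rho \sim t^{-2}$ together with $M \propto \rho L^3$.

```latex
\delta M \sim \frac{\delta\rho}{\rho}\,\rho L^{3}, \qquad G\rho \sim t^{-2}
\;\Longrightarrow\;
\delta k = \frac{G\,\delta M}{Lc^{2}}
         \sim \frac{\delta\rho}{\rho}\,\frac{G\rho L^{2}}{c^{2}}
         \sim \frac{\delta\rho}{\rho}\left(\frac{L}{ct}\right)^{2}.

\delta k = \text{constant at fixed } t
\;\Longrightarrow\;
\frac{\delta\rho}{\rho} \propto L^{-2} \propto M^{-2/3}.
```

At horizon crossing, $L = ct$, the density contrast equals $\delta k$ itself, which is the sense in which a scale-invariant spectrum has the same amplitude on every scale as it enters the horizon. The resulting $M^{-2/3}$ dependence is the $n = 1$ case of the general mass scaling, so smaller masses have larger amplitudes and collapse first.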

This is true for cold dark matter: introduction of hot dark matter 
suppresses fluctuations via the associated free-streaming on scales $\lesssim 100\,(1\ \mathrm{eV}/m_\nu)$ Mpc.
For an admixture of hot dark matter with cold dark matter
predominating, the only possibility for hot dark matter to play any role
in structure formation, partial suppression occurs on small scales, and neutrino
masses of 1 to 2 eV are allowed by most of the observational constraints
apart possibly from that of early structure formation.

The power spectrum is defined by $\langle(\delta\rho/\rho)^2\rangle = \int k^2 P(k)\,dk$ and
$\delta\rho/\rho = \int \delta_k e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k$. The slope is well determined by CMB observations, most notably
COBE, to be $n = 1 \pm 0.2$. If $\Omega_0 = 1$, standard cold dark matter fails
to account for the simplest parametrization of the power spectrum. This
utilizes the slope defined by $P(k) \propto k^n$ with normalization $\sigma_8$ specified at
$8h^{-1}$ Mpc, where the galaxy count fluctuations have unit amplitude. Determination
of $\sigma_8$ from bulk flows is, as previously mentioned, $\Omega_0$-dependent.

The combination of redshift survey data, which measure the fluctua- 
tions in luminous matter, with CMB anisotropy analysis, which can be 
reconstructed to yield the fluctuations in dark matter for a specified model, 
may be used to examine biasing on large scales. The CMB-normalized 
power spectrum requires $\sigma_8 \approx 1$ for IRAS and blue-selected galaxy samples.
These are essentially unbiased. Red-selected samples, in which galaxy
clusters are more prominent, display a small bias, $\sigma_8 \approx 0.8$. Mass therefore
seems to trace light on ~ 10 Mpc scales for field galaxies, while more highly 
clustered objects such as galaxy clusters represent rarer, and hence more 
biased, peaks, in the initially gaussian distribution of density fluctuations. 

The $\Omega = 1$ standard CDM model predicts excessive power at
10 to 100 Mpc scales once CMB-normalized. This manifests itself as antibias,
which would be unphysical, and generates both extensive bulk flow
velocities on 10 to 30 Mpc scales, and excessive cluster abundances, effectively
on 10 Mpc scales. There is also a shape problem at 10 to 30 $h^{-1}$ Mpc, with
the predicted power being flatter than the observed slope. There are three
standard solutions that correct the power spectrum normalization and shape
fit problems at 10 Mpc scales. These are an admixture of 20 percent hot
dark matter ($m_\nu = 6$ eV), a tilt ($n \approx 0.9$) or lowering of $\Omega_0$ to $\Omega_0 \approx 0.3$.

The latter forces a power spectrum renormalization (by a factor ~ due 
to the suppressed growth) combined with a shift in the power spectrum
peak wavelength ($\propto \Omega_0^{-1}$) to larger scales. The former two solutions have
too little power at subgalactic scales to allow sufficient early formation of
Lyman alpha clouds and star-forming galaxies, required at $z > 4$. Hence
low $\Omega_0$ models are required, with or without $\Lambda$, which has little effect on
$P(k)$ as fit to the matter fluctuations derived from redshift surveys. The
reconstruction of $P(k)$ from CMB fluctuations is sensitive to $\Lambda$, which however
at specified curvature is degenerate with $\Omega_b$. A combination of low
redshift weak lensing surveys with CMB map fluctuation analysis will be
required to self-consistently derive $\Omega_\Lambda$ as well as $\Omega_0$ and $\Omega_b$.

Perhaps the most striking success has been the ease with which the CDM 
model has accommodated the properties of the intergalactic medium. Gas 
has been included in the large-scale structure simulations, and an ionising 
radiation field as observed from quasars is added. At low redshift, one is 
able to explain (Dave et al. 1999) the detailed properties of the Lyman 
alpha absorption clouds observed towards quasars over the entire range 
of observed column densities, 10^^ to 10^^ Hl/cm^. As mentioned above, 
the value of $\Omega_b$ required for this to work agrees with the nucleosynthesis
expectation if $\Omega_b \approx 0.03$, confirming that the full baryonic complement of
the universe is observed at $z \approx 3$. At high z, one runs into a problem, since
the observed quasar ionization field is insufficient to maintain the ionization 
observed for the intergalactic gas. One may appeal to a population of star- 
forming dwarf galaxies or of low luminosity quasars to provide the shortfall 
in the ionizing flux at high redshift. 

The general agreement of CMB and large-scale structure observations, 
especially in the region where similar ~ 100 Mpc scales are probed, with the 
predicted $P(k)$ for $\Omega \approx 0.3$ is remarkable. It argues that we cannot be far
from the correct model, given the concordance between independent data 
sets that probe the universe at z ~ 1000 and z ~ 0. However the fit is not 
perfect, and at least one of the redshift surveys (the only survey in physical 
as opposed to redshift space) has a shape near the 100 Mpc peak that is 
significantly more peaked than the simplest inflationary theories. Either the 
current data suffer from hitherto unidentified systematic errors or we may
be required to develop a more precise parametrization of $P(k)$ that takes
account of detailed shape variations. 

One could imagine a primordial feature in the power spectrum near the 
peak. This has been suggested to arise, for example from the effects of 
multiple scalar fields driving multiple stages of inflation [27, 28] or with an 
incomplete first order phase transition responsible for production of large- 
scale bubbles, whose size and frequency must be tuned to fit the data [29]. 
One consequence of such tinkering with canonical inflation is that non- 
gaussianity may be introduced. This would further complicate the determi- 
nation of cosmological parameters. With or without non-gaussianity, excess 
power in $P(k)$ near 100 Mpc could play a role in accounting for results from
narrow angle surveys that appear to show evidence for anomalous clustering 
power near this scale. The current situation with regard to data will soon 
improve, with the imminent availability of $\sim 250\,000$ 2dF galaxy redshifts
and $\sim 10^6$ SDSS redshifts. It is presently premature to devote too much
attention to possibly anomalous features in the galaxy survey data. 

In summary, the basic ingredients underpinning the model are estab- 
lished and confirmed via several observational probes. The primordial fluc- 
tuations are measured in the cosmic microwave background, where the rms 
amplitude on the last scattering surface, $\delta T/T \sim 10^{-5}$ over angular scales
from a fraction of a degree up to the observable horizon scale, sets the 
amplitude of the inflationary density fluctuations. These are effectively in- 
terpreted as setting the fluctuation amplitude at horizon crossing. Since 
little subhorizon growth has occurred at last scattering on large angular 
scales, one can directly measure the inflationary imprint of primordial met- 
ric fluctuations. The consequences of linear growth appear in deep galaxy 
surveys, where the density fluctuation field $\delta\rho/\rho$ can be directly measured,
at least for the luminous component of the matter. It is a good approxima- 
tion to assume that luminous and dark matter components trace each other 
on sufficiently large scales, provided one is in the linear regime. Large-scale
velocity field measurements probe the total density field and generally verify 
the assumption that bias is small on large scales. 



10 Ab initio galaxy formation 

Inclusion of the baryons provokes more serious issues that are mostly unre- 
solved at present. Indeed the most urgent unsolved problem in the standard 
model of cosmology is that of galaxy formation. The theory is phenomeno- 
logical, and is largely based on local measurements of star formation rates 
and the initial mass function of stars. There is no fundamental theory of 
local star formation or of the origin of the mass function of stars. Whether 
these local properties that are essential to galaxy formation operate un- 
changed in extreme environments is unknown. 

Galaxy clusters are the largest virialized systems, and have formed by 
purely gravitational processes. Hence they are relatively straightforward 
to model, the main problem being to simulate a volume large enough to 
include a fair sample of clusters. One can infer, from the observed morphol- 
ogy, the dynamical ages of clusters, which are typically several billion years 
or more, and the initial fluctuation amplitude can be inferred on a scale 
of $\sim 10\,h^{-1}$ Mpc. This result of course is cosmological model-dependent,
but provides an important constraint on the fluctuation amplitude in the 
non-linear regime. The final probe of the primordial fluctuations on scales 
from 0.1 to 1 Mpc relies on inferring the epoch of galaxy formation. This 
procedure is beset by considerable uncertainty: one has to distinguish dy- 
namical or morphological ages from stellar ages, which will not necessarily 
coincide. The process of age reconstruction requires knowledge of the star 
formation history, and there is no fundamental theory of star formation to 
which one may appeal in simulations of galaxy formation. 

However by incorporating semi-analytic prescriptions for star formation, 
crudely based on observed global star formation rules, into numerical simu- 
lations of large-scale structure that provide the mass distribution, structure 
and location of galaxy halos, semi-analytic galaxy formation has made con- 
siderable headway. Crucial to its success has been the incorporation of cold 
dark matter as providing the bulk of the matter in the universe. Cold dark 
matter consists of weakly interacting particles that are stable relics of the 
very early universe. Since the dark matter is cold at the epoch of fluctuation 
growth, dark halos develop hierarchically over a wide range of scales, from 
sub-solar mass scales up to galaxy cluster scales. The dark side of galaxy 
formation theory makes clear predictions. These have had mixed success. 

There are some discouraging aspects. For example, magnetic fields play 
a central role in local star formation by supporting clouds against premature 
collapse. The cosmic history of magnetic fields is completely uncertain, since
there is no consensus on the sites of the dynamos that generated pregalactic,
protogalactic and galactic fields. Disk formation is understood in terms of 
slow gas accretion and cooling, and spheroids form by mergers. Merger 
theory, in terms of dark halos, is well understood via numerical simulations. 
However the accompanying role of star formation is less secure. Whether the 
star formation precedes most of the merging and occurs in small subunits 
or mostly occurs monolithically in the last major merger is uncertain. 

Simulations are best adapted to handle the dark halo evolution, and 
have been supplemented by semi-analytical techniques to follow the actual 
process of luminous galaxy formation. There are a considerable number of 
parameters that enter galaxy formation theory, and the net result is that any 
specific observational issue can be addressed with apparent success. What is 
more difficult is to simultaneously account for all observational constraints 
on galaxy formation in the context of currently available models. One of 
the first challenges of the cold dark matter theory was to account for the 
clustering of galaxies, and in particular for galaxy correlations over scales 
from 1 to 30 Mpc. Simulations showed that the power-law clustering ob- 
served was readily explained, and predicted the evolution of the clustering 
with epoch. In particular, galaxies were identified as originating from rare 
density fluctuations, and hence as biased tracers of the mass distribution. 
High redshift observations of Lyman-break galaxies at $z = 3$ to $4$ have
provided enough angular coverage to verify this prediction [30]. The simulated
properties of clusters, including morphology, abundance, evolution and gas
fraction, accord well [31] with observations for a CDM-dominated universe
with $\Omega_m \approx 0.3$. Another CDM application of considerable success
has been to massive galaxy halos [32]. The abundances and clustering
properties are accounted for (again if $\Omega_m \approx 0.3$), as is the
luminosity function (for galaxies of luminosity above $\sim 0.1\,L_*$) and the
typical halo profile, as measured by the rotation curves of massive galaxies
[33].

11 Cold dark matter: Where we are today 

There are a number of predictions of the CDM model that are in apparent 
conflict with the observational data. For example, the theory of dark halo 
merging generically predicts a tail proportional to $M^{-2}$ in the halo mass
spectrum, yet the observed galaxy luminosity function is significantly
flatter, varying approximately as $L^{[\ldots]}$ to at least $M_B = -14$
(e.g. [34,35] if one considers field galaxy populations that are not dominated
by irregular galaxies). High resolution simulations confirm the persistence of halo
substructure [36]. An order of magnitude more satellite dwarf galaxies is 
predicted than observed for the Milky Way galaxy. Even if feedback strips 
the protosatellites of baryons and hence makes them star-poor and dark,
many of the dark matter objects will survive, and there is a considerable 
risk of overheating the galaxy disk. 

A standard resolution of the observed satellite problem is to introduce 
feedback from supernova explosions, which drives a wind that preferentially 
disrupts the gas reservoir and thereby inhibits star formation in low mass 
galaxies with low escape velocities [37]. However, once a feedback
prescription is adopted that preferentially affects low mass galaxies, one can no
longer modify the luminosities of massive galaxies, unless of course new pa- 
rameters are added. The result is that the normalisation of the Tully-Fisher 
relation, between galaxy luminosity and maximum rotation velocity, fails: 
massive galaxies are found to be underluminous for their mass [38]. This 
is likely to be a generic problem, as might be inferred from any cold dark 
matter model in which on scales of say 10 Mpc, mass traces light. On these 
scales, the effective mass-to-light ratio for the typical object is $1000\,\Omega_m$.
This is what is found for clusters, for which the model is fully consistent, 
but one would expect the ratio of mass-to-light for massive galaxies to be 
a factor of $(M_g/M_{cl})^{[\ldots]}$ or $\sim 10$ less than this, namely 30 with $\Omega_m = 0.3$.
This should be compared to the observed value of ~ 10. 
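
The arithmetic in this argument can be checked in a few lines. The sketch below uses only the round numbers quoted in the text (a cluster-scale mass-to-light ratio of about 1000 times the density parameter, a reduction of about a factor 10 for massive galaxies, a density parameter of 0.3, and an observed value of about 10). The function and variable names are illustrative assumptions and do not come from any published code.

```python
# Mass-to-light bookkeeping for the CDM normalisation argument.
# All inputs are the round numbers quoted in the text; names are illustrative.

def predicted_galaxy_mass_to_light(omega_m, cluster_coeff=1000.0, reduction=10.0):
    """Return (cluster M/L, predicted massive-galaxy M/L) in solar units."""
    cluster_ml = cluster_coeff * omega_m   # ~1000 * Omega_m on ~10 Mpc scales
    galaxy_ml = cluster_ml / reduction     # ~10 times lower for massive galaxies
    return cluster_ml, galaxy_ml

cluster_ml, galaxy_ml = predicted_galaxy_mass_to_light(omega_m=0.3)
observed_ml = 10.0                          # observed value quoted in the text
excess = galaxy_ml / observed_ml            # factor by which the prediction is too high

print(f"cluster M/L ~ {cluster_ml:.0f}, galaxy M/L ~ {galaxy_ml:.0f}, excess ~ {excess:.0f}x")
```

Under these assumptions the predicted galaxy mass-to-light ratio exceeds the observed one by a factor of about three, which is the size of the discrepancy the argument is pointing at.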

There is also a problem with the inferred dark matter distribution within 
the half-light radii of galaxies as observed at optical wavelengths.
Simulations have predicted a universal halo density profile with an inner cusp
$\rho \propto r^{-\beta}$ and $\beta \approx 1$. High resolution simulations
find that $\beta$ depends on halo mass, and systematically steepens to
$\beta \approx 1.5$ for sub-$L_*$ galaxies [39,40]. A related problem may be
that the mass-to-light ratio within the optical radius of galaxies is about a
factor 3 higher than is observed. This is found for
the Milky Way, where the rotation velocity of the sun and the stellar con- 
tent of the galaxy within the solar circle determine the inner mass-to-light 
ratio, and is also manifest in the Tully-Fisher relation. Samples of a thou- 
sand or more nearby spirals find an excellent correlation between maximum 
rotation velocity and luminosity, and reconciliation of the observed
normalization shows that the CDM prediction of mass-to-light ratio, inferred from
the Tully-Fisher zero point, is too high by a factor of $\sim 3$ [41].
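
To see why the inner slope matters observationally, it helps to translate the density profile into a rotation curve. The following is a standard back-of-envelope derivation, assuming spherical symmetry and a pure power-law inner profile with $\beta < 3$; the symbols $M(<r)$ and $v_c$ are introduced here for illustration only.

```latex
\begin{align}
\rho(r) &\propto r^{-\beta}, \\
M(<r) &= 4\pi \int_0^r \rho(r')\, r'^2\, dr' \;\propto\; r^{3-\beta}, \\
v_c(r) &= \sqrt{\frac{G\,M(<r)}{r}} \;\propto\; r^{1-\beta/2}.
\end{align}
```

A cusp with $\beta \approx 1$ therefore gives $v_c \propto r^{1/2}$, $\beta \approx 1.5$ gives $v_c \propto r^{1/4}$, and a constant-density core ($\beta = 0$) gives a linear, solid-body rise $v_c \propto r$. The shape of the inner rotation curve is thus a direct probe of $\beta$ wherever dark matter dominates the mass.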

The universal dark matter profile has also encountered difficulties in 
accounting for the rotation curves of dwarf spiral galaxies. These ob- 
jects, unlike L* galaxies, are everywhere dark matter-dominated, and hence 
should provide good laboratories for studying dark matter. The dark matter 
profile in the inner cores of galaxies can be directly measured for low surface
brightness dwarf spiral galaxies. The observed rotation curves often, but 
not invariably, find soft cores, that is to say, constant density dark matter 
core profiles, in contradiction with the generic CDM expectation of a cuspy 
core [42]. It is not yet clear that numerical simulations of the dwarf galaxy
halos have necessarily reached the resolution required to be confident of the
predicted cuspy profiles [43,44].

The high resolution numerical simulations have unveiled an especially
serious contradiction with observations of disk galaxies. This is the prediction
of disk sizes. Earlier work, both analytic and numerical, found that the
initial dimensionless angular momentum acquired by a protogalaxy,
$\lambda \approx 0.06$, is conserved at any given radius as the baryons cool
and contract in the dark halo. The final disk size was predicted to be
$\sim \lambda r_h$, where $r_h$ is the halo radius, and agrees well with the
observed disk sizes of $\sim 5$ kpc.
The high resolution simulations, however, find that the infall is clumpy 
and angular momentum is transferred outwards as dense clumps of baryons
sink by dynamical friction into the halo core. The consequence is that the 
predicted disk sizes are too small, by almost an order of magnitude [45]. 
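
The disk-size scaling in this paragraph is easy to explore numerically. The sketch below inverts it for the halo radius implied by the quoted dimensionless angular momentum of about 0.06 and a 5 kpc disk, and then shows the size that results if the clumpy infall costs a factor of ten. That factor of ten is a round-number stand-in for "almost an order of magnitude", and all names are illustrative.

```python
SPIN_PARAMETER = 0.06        # dimensionless angular momentum quoted in the text
OBSERVED_DISK_KPC = 5.0      # typical observed disk size quoted in the text

def disk_size_kpc(halo_radius_kpc, spin=SPIN_PARAMETER):
    """Disk size if angular momentum is conserved: r_d ~ spin * r_h."""
    return spin * halo_radius_kpc

def halo_radius_kpc(disk_kpc, spin=SPIN_PARAMETER):
    """Invert r_d ~ spin * r_h for the halo radius."""
    return disk_kpc / spin

r_h = halo_radius_kpc(OBSERVED_DISK_KPC)
clumpy_disk = disk_size_kpc(r_h) / 10.0     # assumed tenfold loss in clumpy infall
print(f"implied halo radius ~ {r_h:.0f} kpc; clumpy-infall disk ~ {clumpy_disk:.1f} kpc")
```

The point of the exercise is only that a sub-kiloparsec disk is the natural outcome once most of the angular momentum is handed to the halo, whatever the precise halo radius.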

It is likely that all of these problems have straightforward resolutions 
requiring more detailed physical input. Baryonic disk formation, if
sufficiently non-axisymmetric, may grossly perturb the inner dark halo profile.
Massive protogalactic winds may deplete the bulk of the low angular mo- 
mentum gas. Feedback from early star formation may suffice to prevent 
most of the protodisk gas from losing excessive angular momentum before 
most disk stars have formed. Nevertheless, in the absence of quantitative 
modelling of inner halo structure and disk sizes, it is premature to attach 
much significance to the ab initio predictions of young galaxy properties in 
the early universe. 

12 Resolving the CDM conundrum 

One can conceive of two very different approaches towards resolving some 
of the difficulties being encountered in CDM modelling of galaxy formation. 
One might tinker with the particle physics. This could involve abandoning 
CDM, and replacing it by self-interacting dark matter. One would need 
to design dark matter particles that do not annihilate but scatter strongly 
and inelastically. Such models have been proposed (Carlson et al. 1992), 
and should be capable of suppressing cuspy cores and substructure in dark 
matter halos (Spergel and Steinhardt 1999), albeit at the cost of intro- 
ducing other difficulties (Miralda-Escude 2000). Another particle physics 
variant is to modify the primordial power spectrum. Warm dark matter 
would suppress primordial power on subgalactic scales, still generating
density fluctuations but with the rms density fluctuations $\sigma(M) \approx$ constant as
opposed to being weakly divergent on small scales. Structure formation on 
galaxy scales would be more abrupt than for CDM: all substructure would 
form simultaneously with the massive halo. This should produce less sub- 
structure in the final halo (Sommer-Larsen and Dolgov 1999), but the cuspy 
core profiles would probably still be generated. This is because warm dark 
matter is effectively cold at the epoch of galaxy halo formation. A similar
situation would arise for the intermediate case, in which the power
spectrum on subgalactic scales is only partially suppressed (Kamionkowski
and Liddle 1999).

A distinctly different approach is to modify the baryonic physics. One 
could consider initial conditions for disk formation in which the final merger 
consisted of two massive substructures, rather than the more gentle infall 
or merging of minor satellites that is usually favoured. In this situation, the 
protodisk would pass through a massive gaseous bar phase, whose tidal in- 
teractions would eject dark matter in the cusp and smooth out substructure 
at least in the plane. The bar would self-destruct rapidly, within a few dy- 
namical times and only then form the bulk of the disk stars. A key question 
of course is whether enough angular momentum remains to account for disk 
sizes, but starburst wind-driven mass loss of the low angular momentum gas 
that preferentially forms the inner dense protogalactic core could alleviate 
this problem. 

One could also appeal to feedback. However, naive feedback parameterisations
with a gas ejected fraction proportional to $v^{-2}$, where $v$ is the disk
circular velocity, cannot work, since they are already specified by the
semi-analytic galaxy formation theory in order to ameliorate the mass function
problem. A more complicated parameterisation of feedback is needed: in 
other words, additional parameters are required. I describe an example of 
such an approach, appealing to a multiphase interstellar medium, in the 
following section. 

13 An empirical approach to disk star formation 

The advantages of an empirical approach towards forming galaxies are that 
theoretical motivation can be combined with phenomenology to provide 
predictions of forming galaxies at extremely high resolution, limited only 
by nearby observational constraints, and at extremely high redshift. The 
principal weakness is that a model for star formation induced via merging 
must be incorporated, but this is no different from the standard approach 
to merging and hierarchical structure formation. 

The foundation of the empirical model commences with observations of 
nearby disk galaxies. The star formation rate is proportional to gas sur- 
face density, and the empirical fit can actually be improved if a dependence 
on galactic rotation rate is explicitly incorporated. The data directly
yield the observed star formation rate per unit area
$\dot\mu_* \propto \mu_{\rm gas}^{[\ldots]}$, which is consistent with the
naive theoretical expectation $\dot\mu_* \propto [\ldots]$ (Kennicutt 1998).

Theoretical models which suggest this approximate proportionality include 
spiral density wave-induced molecular cloud collisions and the gravitational 
instabilities of a cold gaseous disk to massive molecular cloud formation. I 
will describe an explicit derivation below which incorporates feedback via a
multiphase interstellar medium. In all of these cases, molecular clouds are
assumed to be the precursor phase to the bulk of galactic star formation. 

Detailed fitting to the Milky Way, where one can incorporate chemical 
evolution as well as the star formation history and radial profiles of gas and 
stars, yields a generalization of the simplest star formation rate law to 



gas) infall. 



This can be taken as an empirical model. Metal-poor gas infall is ubiqui- 
tous in cosmological modelling of early galaxy evolution, and is explicitly 
required in order to reproduce galactic chemical evolution even when other 
parameters such as time-variations of yields or of the IMF are introduced. 
The star formation efficiency $\epsilon$ represents the fraction of gas converted into
stars per dynamical time-scale. 

The key to understanding star formation efficiency lies in deriving the 
velocity dispersion of the interstellar gas. For the smaller clouds, supernova 
remnants dump energy and momentum into the interstellar gas and this 
is balanced by energy dissipated in cloud-cloud collisions. The most mas- 
sive clouds are accelerated via mutual tidal interactions that are driven by 
disk gravitational instabilities. There must be a steady state in which the 
clouds acquire a net momentum, since there are continual dynamical inter- 
actions between clouds of all masses, for example via processes of accretion, 
coagulation and fragmentation. One infers that 



$$\epsilon = [\ldots] \approx 0.03 .$$



Here $v_{\rm rot}$ is the disk maximum rotational velocity, $v_{\rm SN}$ the
specific momentum injected by supernovae, $l_{\rm cl}$ is the typical cloud
radius, $l_t$ is the cloud tidal radius, and $Q_{\rm gas}$ is the Toomre
parameter for the gas component of the disk. Self-regulation of star formation
requires that $Q_{\rm gas} \sim 1$, and is responsible for the low efficiency
of star formation and for the longevity of disk star formation over several
Gyr. The reason $\epsilon$ is so small and star formation is inefficient is
that the self-regulated disk cannot be very unstable, e.g. with
$Q_{\rm gas} \ll 1$, otherwise the cold gas supply would rapidly be exhausted,
thereby quenching the disk instability. The disk gas velocity dispersion is
given by

$$\sigma_g = v_{\rm rot}\,(\epsilon\,\eta\,\mu_{\rm gas}/\mu_*)^{[\ldots]} \approx 10 \text{ to } 20\ {\rm km\,s^{-1}},$$



where $\eta$ is a second, local self-regulation parameter, of order unity and
defined below, that is associated with cloud-cloud interactions and star
formation. Hence the physics of disk star formation can be reduced to specifying
two parameters, $\epsilon$ and $\eta$, which are self-regulated to be of order
unity by the physics of global star formation.

The feedback processes involve cloud aggregation and star formation.
The molecular clouds are accelerated tidally by the galactic gravitational 
field and by momentum injection from expanding supernova remnants, lose 
energy via collisions and accumulate mass. Once sufficiently massive, as de- 
termined for example by a magnetically-limited Jeans criterion, the clouds 
collapse and form stars. Cloud growth is regulated by the cloud-cloud inter- 
actions. These can be naively modelled in terms of a macroscopic viscosity. 
The radial flow of gas is limited by the effective viscosity, which therefore 
controls disk and bulge formation. One defines the viscosity coefficient by 

$$\nu_{\rm eff} = [\ldots],$$

where $\lambda_{\rm cl}$ is the cloud effective mean free path. If one makes the ansatz
that the viscous drag time-scale is equal to the star formation time-scale
$t_{\rm sf}$, as plausibly expected in the viscosity-limited model of self-regulation,
then an exponential density profile results for the forming disk (Lin and
Pringle 1987). I define the feedback parameter $\eta$ as the ratio of these two
time-scales.
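
For orientation on the Toomre parameter that self-regulation is argued to hold near unity, the conventional thin gas disk form is $Q_{\rm gas} = \sigma_g \kappa / (\pi G \mu_{\rm gas})$, with $\kappa$ the epicyclic frequency. The sketch below evaluates it for a flat rotation curve, where $\kappa = \sqrt{2}\, v_{\rm rot}/R$. The input values (a 10 km/s dispersion, 220 km/s rotation at 8 kpc, and 10 solar masses per square parsec of gas) are illustrative round numbers chosen here, not values taken from the text, and the function names are hypothetical.

```python
import math

G_KPC = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / M_sun

def epicyclic_frequency_flat(v_rot_kms, radius_kpc):
    """Epicyclic frequency for a flat rotation curve, in km/s/kpc."""
    return math.sqrt(2.0) * v_rot_kms / radius_kpc

def toomre_q_gas(sigma_kms, kappa_kms_per_kpc, surface_density_msun_pc2):
    """Toomre Q for a gas disk: Q = sigma * kappa / (pi * G * Sigma)."""
    surface_density_kpc2 = surface_density_msun_pc2 * 1.0e6  # M_sun/pc^2 -> M_sun/kpc^2
    return sigma_kms * kappa_kms_per_kpc / (math.pi * G_KPC * surface_density_kpc2)

kappa = epicyclic_frequency_flat(v_rot_kms=220.0, radius_kpc=8.0)
q_gas = toomre_q_gas(sigma_kms=10.0, kappa_kms_per_kpc=kappa,
                     surface_density_msun_pc2=10.0)
print(f"kappa ~ {kappa:.1f} km/s/kpc, Q_gas ~ {q_gas:.1f}")
```

Because $Q_{\rm gas}$ scales linearly with the velocity dispersion and inversely with the gas surface density, a gas-rich disk sits closer to the instability threshold unless its dispersion rises, which is the sense in which feedback can regulate it.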

Let us now ask the question as to how feedback operates to maintain 
$Q_{\rm gas}$ to be of order unity. The answer must lie in the detailed structure
of the interstellar gas, and in particular, in the equipartition of energy and 
momentum between the hot phase, directly generated by star formation, and 
the cold phase, within which the stars form. A simple model of porosity for 
the multiphase disk gas allows one to explicitly simulate the feedback. The 
porosity can be defined as 

$$P = (\text{4-volume of supernova remnant}) \times (\text{supernova rate per unit volume})$$

and is related to the hot gas disk volume fraction $f$ by

$$P = -\log(1 - f).$$

Simple modelling of spherically symmetric supernova remnants shows that
$P$ depends primarily on the turbulent pressure $p_g = \rho_g \sigma_g^2$, and
more specifically, that $P \propto p_g^{[\ldots]}$.
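
The relation between porosity and hot-gas filling factor can be inverted directly, giving $f = 1 - e^{-P}$. The short sketch below tabulates it, assuming the logarithm in the text is the natural logarithm; the function names are illustrative.

```python
import math

def hot_fraction_from_porosity(porosity):
    """Invert P = -ln(1 - f) for the hot-gas volume fraction f."""
    return 1.0 - math.exp(-porosity)

def porosity_from_hot_fraction(f_hot):
    """Porosity P = -ln(1 - f); requires 0 <= f_hot < 1."""
    if not 0.0 <= f_hot < 1.0:
        raise ValueError("hot-gas fraction must lie in [0, 1)")
    return -math.log(1.0 - f_hot)

for p in (0.1, 0.5, 1.0, 3.0):
    print(f"P = {p}: f = {hot_fraction_from_porosity(p):.2f}")
```

For small porosity the two quantities are nearly equal, while at a porosity of order unity a little under two thirds of the disk volume is hot and the filling factor saturates towards one for larger values.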

In the case where feedback is important, $P \sim 1$, and the star formation
rate is effectively determined. One can effectively derive the previously
assumed empirical star formation law to be
$\dot\rho_* \propto \rho_g^{[\ldots]} \sigma_g^{[\ldots]}$. Note that the
constant of proportionality, in effect the star formation efficiency, is
determined by the feedback ansatz. The star formation rate is predicted to
depend on turbulent velocity dispersion, efficiency increasing with enhanced
turbulence such as would be induced in a merger. One can now study disk
formation in some detail. The star formation and viscous time-scales are found
to be less than the dynamical friction time-scale for clump infall in the inner
dark halo, so that protodisk formation results in a disk scale-length of the
correct order-of-magnitude. Spheroids are quite another matter however, and
must be considered in the context of hierarchical structure formation.



14 Testing models of galaxy formation 



Dark halos are an essential ingredient for a model of galaxy formation. 
Hierarchical formation is sufficiently well tested and appealing that it must 
surely be correct: going to, say, a fragmentation, top-down theory of galaxy 
formation would be a step backwards. An optimist would advocate that 
minor tuning of the CDM model is all that is needed, although a pessimist 
might well abandon CDM. However this is relatively light relief compared to 
the problems inherent in star formation in the context of galaxy formation. 

Star formation is intrinsically a vastly more complex and correspond- 
ingly uncertain phenomenon. There are many possible ingredients that 
are omitted from the simplest prescriptions as previously described. When 
generalized to situations other than those prevailing in a cold, gas-rich disk, 
even the most basic questions remain unsolved. For example, consider the 
following. Stars may have been assembled early in the hierarchy, in many 
small subunits. Alternatively, stars formed monolithically, only after the 
more massive galaxies have assembled. One could imagine either situation, 
or any combination of the two, prevailing, given our limited understanding 
of both global and local star formation. 

One can try to resolve such issues phenomenologically. With regard to 
bulge formation, there are three choices. Bulges either form before disks, 
as in hierarchical modelling, after disks, as suggested by secular evolution 
of a transient bar phase that precedes disk formation, or simultaneously as 
would be appropriate in a monolithic model. In practice, it is likely that
all three pathways to bulge formation can be found. To decide which is the 
dominant process, one can simply consider in turn each of the three limiting 
situations as prevailing. 

Simple models can be constructed using backwards evolution of disks 
combined with starburst models for the bulges, but varying the relative
time sequence. Incorporating an appropriate luminosity function, one can 
add the various observational artefacts (including pixelization, noise, surface 
brightness dimming) and try to reproduce images of the Hubble Deep Field. 
Extraction of bulge and disk properties from both the simulations and real 
images provides a means of testing alternative bulge formation scenarios. 
With current data, one cannot distinguish the rival models [46]. However
NGST will eventually probe deeply enough to provide some discriminatory
features.

Disk formation is based on a reasonably strong theoretical and empirical
framework. Here again there are at least two distinct approaches towards
simulating the distant universe. Hierarchical formation motivates
the ab initio approach, and semi-analytical formation provides a predictive 
framework. However disk sizes are put in by hand, and this is a dangerous 
procedure. Without an underlying physical theory of disk sizes, one cannot 
test the robustness of the predictions. 

An alternative approach is empirical, using backwards formation. One 
can predict the appearance of disks at early epochs in this model. Again, 
simulations of the Hubble Deep Field via the two complementary approaches 
provide a possible discriminant. One can also attempt to simulate the Tully- 
Fisher relation and its evolution with redshift, and search for variations of 
colour, surface brightness and disk size with luminosity. To avoid selection 
bias, all of this information should be extracted from simulated images. 
The net results from the simulations are somewhat depressing, at least with
regard to current data. However the much deeper images anticipated with
NGST, combined with near infrared spectroscopic follow-up using modern 
techniques and very large ground-based telescopes, should enable one to 
discriminate between the extreme models that I have described here. 

With regard to elliptical galaxy formation, the theoretical situation is 
even murkier than for spirals. There is no fundamental theory for star 
formation in a predominantly gaseous spheroid. Observations are well in 
advance of theory. The empirical evidence suggests that ellipticals form 
via merger-induced starbursts, but whether the bulk of the stars formed 
in a single monolithic starburst or in many smaller episodes is unclear. If 
the monolithic model prevails, then the starbursts must have been dust- 
enshrouded. If the latter prevails, and spheroid stars form by a sequence of 
minor mergers, then the naive notion of a single starburst as used in
population synthesis of ellipticals is superseded by a more prolonged phase of
star formation, dribs and drabs of which may persist for a considerable
fraction of a present Hubble time as occasional minor mergers occur. Spheroids
should therefore appear to become bluer at modest redshift.

Observational arguments can be made for both cases. Sub-millimetre 
observations with SCUBA have detected a population of ultra-luminous 
galaxies whose inferred (photometric) redshifts extend up to 3 and be- 
yond [47]. By analogy with nearby ultra-luminous infrared galaxies, one 
infers that some of these galaxies have star formation rates in excess of 
300 $M_\odot$ yr$^{-1}$ that are presumably merger-induced, and are potentially, as
judged by nearby examples of de Vaucouleurs profiles in ongoing mergers, 
forming spheroidal galaxies. The counterargument that gives one pause be- 
fore accepting this interpretation is that the contribution of AGN towards 
powering these extreme infrared luminosities is unknown. In the case of 
more modest local counterparts such as Arp 220, the AGN contribution
is present but sub-dominant, as inferred from far infrared forbidden line
intensity ratios [48]. But the situation may differ at high redshift, where 
the merger rate and the AGN frequency are high. 

The star formation rate history of the universe is augmented at z > 2 
by the contribution from such monolithic starbursts. Indeed, the SCUBA 
counts suggest that the comoving star formation rate per unit volume may 
not fall beyond $z \sim 2$ but continues to stay flat until $z > 5$ [49]. This
interpretation helps account for the diffuse far infrared background, which
requires a substantial contribution (at least 50 percent in $\nu i_\nu$) to come from
a population other than the infrared emission from the dusty disk galaxies
that are observed today [50]. This extra population could be identified with
protoelliptical starbursts. If so, the inferred chemical evolution provides 
a natural way via starburst-triggered outflows to enrich the intracluster 
medium to near-solar metallicities, as observed. 

The case for prolonged star formation in young ellipticals, as opposed to 
a monolithic starburst, is less direct, and stems from the apparent paucity 
of the red, post-starburst objects that are predicted in the single starburst 
model. One resolution might be that young or intermediate age ellipticals 
are blue rather than red because of the prolonged duration of the star
formation phase. Blue, and possibly somewhat irregular, ellipticals could hitherto
have escaped detection in Hubble deep image searches for redshift $\sim 1$ to $2$
counterparts of the current epoch ellipticals. Galaxy clusters at $z \sim 1$ are
not necessarily a good guide to the general population of intermediate age 
ellipticals, since cluster selection introduces a strong bias towards rich, early 
forming and red systems that are simply identifiable as being similar to, and 
even richer than the counterparts of nearby elliptical-rich clusters. 



15 Summary 

It is clear that a fundamental theory of galaxy formation is lacking. It is 
indeed naive to hope that arguments about star formation which may only 
be tested locally should apply at high redshift during the violent process of 
galaxy formation. But there is little choice. Galaxy formation simulations 
rely on simple formulations of phenomenology and on tractability in terms 
of computing requirements. Our prescription for large-scale star formation 
is driven by observations at low redshift, and the modelling is fine-tuned 
by observations of the youthful phases of galaxies at high redshift. Such a 
patchwork approach is not capable of being truly predictive. Nevertheless 
it is the best we can hope to do for the foreseeable future. 

Consider some of the physics that is currently omitted from galaxy
formation theory. The list is overwhelming: turbulence, magnetic fields,
molecule formation and destruction, dust grain evolution and evaporation,
cosmic ray heating and ionisation; and these are just the physical processes
that are bound to play important roles in nearby star-forming regions. On
the more global level of astrophysics that is omitted because of ignorance, 
one could mention temporal and spatial evolution of the initial stellar mass 
function, chemical yield variations, and triggering of star formation by su- 
pernovae and more dramatically, by AGN. It is only reasonable to con- 
clude that all predictions about galaxy formation should at present be taken 
lightly. 

We are close to converging on the definitive cosmological model. Most 
parameters of the background model are under control. The age is
$t_0 = 14 \pm 2$ Gyr, the Hubble constant is
$H_0 = 65 \pm 10$ km s$^{-1}$ Mpc$^{-1}$, the baryon density is
$\Omega_b = 0.04 \pm 0.02$, and the density parameter is
$\Omega_m = 0.3 \pm 0.1$. The urgent model issue is $\Omega_\Lambda$, which is
found to be 0.7 but with uncertain systematic errors because of possible
evolutionary dimming. Current data allow the density fluctuations to be
represented by $n = 1 \pm 0.2$ and $\sigma_8 = 0.8 \pm 0.2$,
with an upper limit on a possible tensor or gravitational wave component
of $T/S < 0.2$. There are indications of possible additional power in the
data relative to the standard scale-invariant prescriptions, as evidenced by
the slope of $P(k)$ near 100 Mpc, excess power at $\sim 100$ Mpc in deep pencil
beam surveys, and the height of the acoustic peak in the CMB anisotropy
spectrum. However the rapidly improving data sets for both CMB and
galaxy redshift surveys to $z \sim 0.2$ suggest that it may be unwise to invest
much effort in exploring current data sets which are of inadequate size and 
are almost certainly contaminated by systematic errors. 
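
As a consistency check on these numbers, the age of a spatially flat universe containing matter and a cosmological constant has the closed form $t_0 = \frac{2}{3 H_0 \sqrt{\Omega_\Lambda}}\,\mathrm{asinh}\sqrt{\Omega_\Lambda/\Omega_m}$. The sketch below evaluates it for the central values quoted above. Exact flatness and the neglect of radiation are assumptions made here, and the function names are illustrative.

```python
import math

KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec
SECONDS_PER_GYR = 3.1557e16  # seconds in one gigayear

def hubble_time_gyr(h0_kms_mpc):
    """Hubble time 1/H0 in Gyr for H0 given in km/s/Mpc."""
    return KM_PER_MPC / h0_kms_mpc / SECONDS_PER_GYR

def flat_lcdm_age_gyr(h0_kms_mpc, omega_m):
    """Age of a flat matter + Lambda universe (radiation neglected)."""
    omega_l = 1.0 - omega_m  # flatness assumed
    factor = (2.0 / (3.0 * math.sqrt(omega_l))
              * math.asinh(math.sqrt(omega_l / omega_m)))
    return factor * hubble_time_gyr(h0_kms_mpc)

age = flat_lcdm_age_gyr(h0_kms_mpc=65.0, omega_m=0.3)
print(f"1/H0 ~ {hubble_time_gyr(65.0):.1f} Gyr, t0 ~ {age:.1f} Gyr")
```

With these inputs the age comes out slightly below the Hubble time and inside the quoted range of 14 plus or minus 2 Gyr, so the central values of the parameters are mutually consistent at this level.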

More fundamental issues that merit more theoretical effort include 
galaxy formation. This currently amounts to phenomenology driven by 
observations. One may hope that higher resolution simulations with im- 
proved star formation physics will eventually improve the present situation. 
Another important model issue is the topology of the universe. There are no 
predictions for this quantity but if the topological scale is of the order of the 
curvature scale, as simple arguments suggest, observable global anisotropies 
are generated in the CMB. Even in the area of quantum gravity, observation
may be ahead of theory. 

References 

[1] D. Fixsen, G. Hinshaw, C. Bennett and J. Mather, ApJ 486 (1997) 623.

[2] B. Chaboyer, Phys. Rep. 307 (1998) 23.

[3] A. Saha et al., ApJ 522 (1999) 802.

[4] B. Madore et al., ApJ 515 (1999) 29.

[5] D. Schramm and M. Turner, RMP 70 (1998) 303.

[6] D. Weinberg et al., ApJ 490 (1997) 564.

[7] R. Cen and J. Ostriker, ApJ 514 (1999) 1.

[8] R. Kirkman and D. Tytler, ApJ 489 (1997) L123.

[9] R. Kirkman and D. Tytler, ApJ 484 (1997) 672.







[10] C. Afonso et al., A&A 344 (1999) L63.

[11] F. Tamanaha et al., ApJ 358 (1990) 164.

[12] R. Ibata, H. Richer, R. Gilliland and D. Scott, ApJ (1999) (in press), preprint [astro-ph/9908270].

[13] R. Ibata et al., ApJL (2000) (in press), preprint [astro-ph/0002138].

[14] N. Bahcall and X. Fan, ApJ 504 (1998) 1.

[15] A. Blanchard and J. Bartlett, A&A 332 (1998) L49.

[16] A. Dekel et al., ApJ 522 (1999) 1.

[17] E. Gawiser and J. Silk, Sci 280 (1998) 1405.

[18] M. Arnaud and A. Evrard, MNRAS 305 (1999) 631.

[19] B. Schmidt et al., ApJ 507 (1998) 46.

[20] S. Perlmutter et al., ApJ 517 (1999) 565.

[21] A. Riess et al., AJ (1999) (in press), preprint [astro-ph/9907038].

[22] W. Hu and M. White, ApJ 471 (1996) 30.

[23] S. Dodelson and L. Knox, preprint [astro-ph/9909454].

[24] L. Knox and L. Page, preprint [astro-ph/0002162].

[25] E. Falco, C. Kochanek and J. Munoz, ApJ 494 (1998) 47.

[26] M. Bartelmann et al., A&A 330 (1998) 1.

[27] M. Sakellariadou (1999) (in press), preprint [hep-ph/9901393].

[28] A. Starobinsky, Grav. Cosmology 4 (1998) 88.

[29] L. Amendola et al., New Astron. 4 (1999) 339.

[30] M. Giavalisco et al., ApJ 503 (1998) 543.

[31] J. Mohr et al., ApJ 517 (1999) 627.

[32] F. Pearce et al., ApJ 521 (1999) L99.

[33] J. Navarro, C. Frenk and S. White, ApJ 490 (1997) 493.

[34] H. Lin et al., ApJ 464 (1996) 60L.

[35] E. Zucca et al., A&A 326 (1997) 477.

[36] B. Moore et al., ApJ 524 (1999) 19.

[37] A. Benson et al., MNRAS 311 (2000) 793.

[38] M. Steinmetz and J. Navarro, ApJ 513 (1999) 555.

[39] Y. Jing and Y. Suto, ApJ 529 (2000) L69.

[40] B. Moore et al., MNRAS 310 (1999) 1147.

[41] J. Navarro and M. Steinmetz, ApJ (1999) (in press), preprint [astro-ph/9908114].

[42] J. Dalcanton and R. Bernstein, in Dynamics of Galaxies: From the Early
Universe to the Present, edited by F. Combes, G. Mamon and Charmandaris, ASP
Conf. Ser. 197 (2000) 161.

[43] B. Moore et al., ApJ 499 (1998) L5.

[44] A. Kravtsov et al., ApJ 502 (1998) 48.

[45] J. Navarro and M. Steinmetz, ApJ 502 (1998) 48.

[46] R. Bouwens and J. Silk, ApJ (2000) (in press).

[47] S. Lilly et al., ApJ (2000) (in press).

[48] D. Rigopoulou et al., AJ 118 (1999) 2625.

[49] D. Hughes et al., Nat 394 (1998) 241.

[50] J. Tan, J. Silk and C. Balland, ApJ 522 (1999) 579.




COURSE 3 



A SHORT COURSE ON BIG BANG NUCLEOSYNTHESIS 

K.A. OLIVE 



Theoretical Physics Institute, 
School of Physics and Astronomy, 
University of Minnesota, Minneapolis 
MN 55455, U.S.A. 




Contents 



1 Introduction

2 Theory

3 Data

4 Likelihood analyses

5 More data

6 More analysis

7 Chemical evolution

8 Constraints from BBN







Abstract 

A brief review of standard Big Bang nucleosynthesis theory and the 
related observations of the light element isotopes is presented. 

1 Introduction 

The standard model [1] of Big Bang nucleosynthesis (BBN) is based on
the relatively simple idea of including an extended nuclear network in a
homogeneous and isotropic cosmology. Apart from the input nuclear cross
sections, the theory contains only a single parameter, namely the
baryon-to-photon ratio, $\eta$. Other factors, such as the uncertainties in
reaction rates and the neutron mean-life, can be treated by standard
statistical and Monte-Carlo techniques [2-4]. The theory then allows one to
make predictions (with specified uncertainties) of the abundances of the
light elements, D, $^3$He, $^4$He, and $^7$Li.

It is interesting to note the role of BBN in the prediction of the
microwave background [5]. The argument is rather simple. BBN requires
temperatures greater than 100 keV, which according to the standard model
time-temperature relation, $t_{\rm s}\, T^2_{\rm MeV} = 2.4/\sqrt{N}$, where $N$ is the number of
relativistic degrees of freedom at temperature $T$, corresponds to timescales
less than about 200 s. The typical cross section for the first link in the
nucleosynthetic chain is

$$\sigma v\,(p + n \to {\rm D} + \gamma) \simeq 5 \times 10^{-20}\ {\rm cm^3/s}. \qquad (1.1)$$

This implies that it was necessary to achieve a density

$$n \sim \frac{1}{\sigma v\, t} \sim 10^{17}\ {\rm cm^{-3}}. \qquad (1.2)$$

The density in baryons today is known approximately from the density of
visible matter to be $n_{B0} \sim 10^{-7}$ cm$^{-3}$, and since we know that the
density $n$ scales as $R^{-3} \sim T^3$, the temperature today must be

$$T_0 = (n_{B0}/n)^{1/3}\, T_{\rm BBN} \sim 10\ {\rm K}, \qquad (1.3)$$

thus linking two of the most important tests of the Big Bang theory.
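
The chain of estimates in equations (1.1) to (1.3) can be checked numerically. The following is a minimal sketch that uses only the rounded inputs quoted in the text; the function names and the MeV-to-kelvin conversion constant are introduced here for illustration and are not part of the original lecture.

```python
# Order-of-magnitude estimate of today's radiation temperature from BBN,
# following equations (1.1)-(1.3). Inputs are the rounded values quoted in
# the text; the helper names are illustrative only.

K_PER_MEV = 1.1605e10  # kelvin per MeV (1 eV corresponds to about 1.1605e4 K)


def required_density(sigma_v_cm3_s: float, t_s: float) -> float:
    """Baryon density n ~ 1/(sigma v t), in cm^-3, needed for about one
    p + n -> D + gamma reaction per nucleon within the time t."""
    return 1.0 / (sigma_v_cm3_s * t_s)


def temperature_today(n_b0_cm3: float, n_bbn_cm3: float, t_bbn_mev: float) -> float:
    """T0 in kelvin, using n ~ R^-3 ~ T^3 so that T0 = (nB0/n)^(1/3) T_BBN."""
    return (n_b0_cm3 / n_bbn_cm3) ** (1.0 / 3.0) * t_bbn_mev * K_PER_MEV


if __name__ == "__main__":
    n_bbn = required_density(5e-20, 200.0)    # sigma v from (1.1), t ~ 200 s
    t0 = temperature_today(1e-7, n_bbn, 0.1)  # T_BBN = 100 keV = 0.1 MeV
    print(f"n ~ {n_bbn:.1e} cm^-3, T0 ~ {t0:.0f} K")
```

With these inputs the density comes out at about $10^{17}$ cm$^{-3}$ and the present temperature at roughly 10 K, as in equation (1.3). The result is only an order-of-magnitude statement, since every input is rounded.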

© EDP Sciences, Springer-Verlag 2000

2 Theory



Conditions for the synthesis of the light elements were attained in the early
Universe at temperatures $T \lesssim 1$ MeV. At somewhat higher temperatures,
weak interaction rates were in equilibrium. In particular, the standard
$\beta$ processes fix the ratio of number densities of neutrons to protons. At
$T \gg 1$ MeV, $(n/p) \simeq 1$.

As the temperature fell and approached the point where the weak
interaction rates were no longer fast enough to maintain equilibrium, the
neutron-to-proton ratio was given approximately by the Boltzmann factor,
$(n/p) \simeq e^{-\Delta m/T}$, where $\Delta m$ is the neutron-proton mass difference. After
freeze-out, free neutron decays drop the ratio slightly. The final abundance
of $^4$He is very sensitive to the $(n/p)$ ratio:

$$Y_p = \frac{2(n/p)}{1 + (n/p)} \approx 0.25. \qquad (2.1)$$

Freeze-out occurs at slightly less than an MeV, resulting in $(n/p) \sim 1/6$ at
this time.
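
The Boltzmann factor and equation (2.1) are simple enough to evaluate directly. The sketch below assumes a neutron-proton mass difference of 1.293 MeV and, purely for illustration, a freeze-out temperature of 0.72 MeV ("slightly less than an MeV"). Neither number is quoted in the text, and the function names are hypothetical.

```python
import math

DELTA_M_MEV = 1.293  # neutron-proton mass difference (assumed value)


def n_over_p(temperature_mev: float) -> float:
    """Equilibrium neutron-to-proton ratio, exp(-Delta m / T)."""
    return math.exp(-DELTA_M_MEV / temperature_mev)


def helium_mass_fraction(ratio: float) -> float:
    """Equation (2.1): Y_p = 2(n/p) / [1 + (n/p)], which assumes that
    essentially every surviving neutron ends up bound in 4He."""
    return 2.0 * ratio / (1.0 + ratio)


if __name__ == "__main__":
    ratio_freeze = n_over_p(0.72)  # illustrative freeze-out temperature
    print(f"(n/p) at freeze-out ~ {ratio_freeze:.3f}")
    print(f"Y_p for (n/p) = 1/7: {helium_mass_fraction(1.0 / 7.0):.3f}")
```

An $(n/p)$ of 1/7 gives $Y_p = 0.25$ exactly, whereas the freeze-out value of about 1/6 would give a mass fraction closer to 0.29. This is why the small drop in $(n/p)$ from neutron decay matters for the final abundance.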

The nucleosynthesis chain begins with the formation of Deuterium by
the process $p + n \to {\rm D} + \gamma$. However, because of the large number of photons
relative to nucleons, $\eta^{-1} = n_\gamma/n_B \sim 10^{10}$, Deuterium production is delayed
past the point where the temperature has fallen below the Deuterium binding
energy, $E_B = 2.2$ MeV (the average photon energy in a blackbody is
$\bar{E}_\gamma \simeq 2.7\,T$). This is because there are many photons in the exponential
tail of the photon energy distribution with energies $E > E_B$ despite the
fact that the temperature or $\bar{E}_\gamma$ is less than $E_B$. During this delay, the
neutron-to-proton ratio drops to $(n/p) \sim 1/7$.
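
The size of this delay can be illustrated with a rough estimate. The sketch below asks at what temperature the number of photons with $E > E_B$ per baryon falls to about one. It treats the photon tail in the Wien limit, and the "one dissociating photon per baryon" criterion is an assumption made here for illustration, not the criterion of a full network calculation.

```python
import math

ZETA3 = 1.2020569  # Riemann zeta(3)


def tail_fraction(x: float) -> float:
    """Fraction of blackbody photons with E > x*T, valid for x >> 1.

    Uses the Wien approximation: the integral of y^2 exp(-y) from x to
    infinity is (x^2 + 2x + 2) exp(-x); the total photon number integral
    is 2 zeta(3)."""
    return (x * x + 2.0 * x + 2.0) * math.exp(-x) / (2.0 * ZETA3)


def bottleneck_temperature(binding_mev: float = 2.2,
                           photons_per_baryon: float = 1e10) -> float:
    """Temperature (MeV) at which there is about one photon with E > E_B
    per baryon, found by bisection on x = E_B / T."""
    lo, hi = 1.0, 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if photons_per_baryon * tail_fraction(mid) > 1.0:
            lo = mid  # still too many dissociating photons: need larger x
        else:
            hi = mid
    return binding_mev / (0.5 * (lo + hi))


if __name__ == "__main__":
    print(f"T ~ {bottleneck_temperature():.3f} MeV")
```

Under these assumptions the temperature comes out a little below 0.1 MeV, far below $E_B = 2.2$ MeV and in line with the scale of about 100 keV quoted in the Introduction.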

The dominant product of Big Bang nucleosynthesis is $^4$He, resulting in
an abundance of close to 25% by mass. Lesser amounts of the other light
elements are produced: D and $^3$He at the level of about $10^{-5}$ by number,
and $^7$Li at the level of $10^{-10}$ by number. The resulting abundances of the
light elements are shown in Figure 1, over the range in $\eta_{10} = 10^{10}\eta$ between
1 and 10. The curves for the $^4$He mass fraction, $Y$, bracket the computed
range based on the uncertainty of the neutron mean-life, which has been
taken as $\tau_n = 887 \pm 2$ s. Uncertainties in the produced $^7$Li abundances have
been adopted from the results in Hata et al. [3]. Uncertainties in D and $^3$He
production are small on the scale of this figure. The boxes correspond to
the observed abundances and will be discussed below.



Fig. 1. The light element abundances from Big Bang nucleosynthesis as a function
of $\eta_{10}$.

3 Data

The primordial $^4$He abundance is best determined from observations of
HeII $\to$ HeI recombination lines in extragalactic HII (ionized hydrogen)
regions. There is now a good collection of abundance information on the
$^4$He mass fraction, $Y$, O/H, and N/H in over 70 [6,7] such regions. Since $^4$He
is produced in stars along with heavier elements such as Oxygen, it is then
expected that the primordial abundance of $^4$He can be determined from the
intercept of the correlation between $Y$ and O/H, namely $Y_p = Y({\rm O/H} \to 0)$.
A detailed analysis of the data including that in [7] found an intercept
corresponding to a primordial abundance $Y_p = 0.234 \pm 0.002 \pm 0.005$ [8]. This
was updated to include the most recent results of [9] in [10]. The result
(which is used in the discussion below) is

$$Y_p = 0.238 \pm 0.002 \pm 0.005. \qquad (3.1)$$

The first uncertainty is purely statistical and the second uncertainty is an
estimate of the systematic uncertainty in the primordial abundance
determination [8]. The solid box for $^4$He in Figure 1 represents the range (at
$2\sigma_{\rm stat}$) from (3.1). The dashed box extends this by including the systematic
uncertainty. The $^4$He data are shown in Figure 2. Here one sees the
correlation of $^4$He with O/H and the linear regression which leads to the
primordial abundance given in equation (3.1).

Fig. 2. The Helium ($Y$) and Oxygen (O/H) abundances in extragalactic HII
regions, from [6] and from [9]. Lines connect the same regions observed by different
groups. The regression shown leads to the primordial $^4$He abundance given in
equation (3.1).

The abundance of $^7$Li has been determined by observations of over
100 hot, population-II stars, and is found to have a very nearly uniform
abundance [11]. For stars with a surface temperature $T > 5500$ K and a
metallicity less than about 1/20th solar (so that effects such as stellar
convection may not be important), the abundances show little or no dispersion
beyond that which is consistent with the errors of individual measurements.
The Li data from [12] are shown in Figure 3. Indeed, as detailed in [12, 13],
much of the work concerning $^7$Li has to do with the presence or absence of
dispersion and whether or not there is in fact some tiny slope to a [Li] =
$\log(^7{\rm Li/H}) + 12$ vs. $T$, or [Li] vs. [Fe/H] relationship ([Fe/H] is the log of
the Fe/H ratio relative to the solar value).

I will use the value given in [12] as the best estimate for the mean $^7$Li
abundance and its statistical uncertainty in halo stars:

$${\rm Li/H} = (1.6 \pm 0.1) \times 10^{-10}. \qquad (3.2)$$

The small error is statistical and is due to the large number of stars in
which $^7$Li has been observed. The solid box for $^7$Li in Figure 1 represents
the $2\sigma_{\rm stat}$ range from (3.2). There is, however, an important source of
systematic error due to the possibility that Li has been depleted in these
stars, though the lack of dispersion in the Li data limits the amount of
depletion. In fact, a small observed slope in Li vs. Fe [13], and the tiny
dispersion about that correlation, indicate that depletion is negligible in
these stars. Furthermore, the slope may indicate a lower abundance of Li
than that in (3.2), Li/H $\simeq 1.2 \times 10^{-10}$ [14]. The observation [15] of the fragile
isotope $^6$Li is another good indication that $^7$Li has not been destroyed in
these stars [16].
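
Since the stellar data are usually quoted on the logarithmic scale [Li] = log(Li/H) + 12, it can help to convert the number ratios above to that scale. The helper below is a small illustrative sketch; its name is not taken from the text.

```python
import math


def li_log_abundance(li_over_h: float) -> float:
    """Logarithmic abundance [Li] = log10(Li/H) + 12."""
    return math.log10(li_over_h) + 12.0


if __name__ == "__main__":
    for ratio in (1.6e-10, 1.2e-10):
        print(f"Li/H = {ratio:.1e}  ->  [Li] = {li_log_abundance(ratio):.2f}")
```

On this scale the plateau value of equation (3.2) corresponds to [Li] of about 2.2, and the lower value suggested by the slope to about 2.1.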
Fig. 3. The Li abundance in halo stars with [Fe/H] < $-1.3$, as a function of
surface temperature. The dashed line shows the value of the weighted mean of
the plateau data.



Aside from the Big Bang, Li is produced together with Be and B in
accelerated particle interactions such as cosmic ray spallation of C, N, O by
protons and $\alpha$-particles. Li is also produced by $\alpha$-$\alpha$ fusion. Be and B have
been observed in these same pop II stars and in particular there are a dozen
or so stars in which both Be and $^7$Li have been observed. Thus Be (and B,
though there is still a paucity of data) can be used as a consistency check
on primordial Li [17]. Based on the Be abundance found in these stars,
one can conclude that no more than 10-20% of the $^7$Li is due to cosmic
ray nucleosynthesis, leaving the remainder (an abundance near $10^{-10}$) as
primordial. This is consistent with the conclusion reached in [14]. The
dashed box in Figure 1 accounts for the possibility that as much as half of
the primordial $^7$Li has been destroyed in stars, and that as much as 20%
of the observed $^7$Li may have been produced in cosmic ray collisions rather
than in the Big Bang. The former uncertainty is probably an overestimate.

4 Likelihood analyses 

At this point, having established the primordial abundance of at least two
of the light elements, $^4$He and $^7$Li, with reasonable certainty, it is possible
to test the concordance of BBN theory with observations. Two elements
are sufficient not only for constraining the one-parameter theory of BBN,
but also for testing for consistency [18]. A theoretical likelihood function
for $^4$He can be defined as

$$L_{\rm BBN}(Y, Y_{\rm BBN}) = e^{-(Y - Y_{\rm BBN}(\eta))^2 / 2\sigma_1^2}, \qquad (4.1)$$

where $Y_{\rm BBN}(\eta)$ is the central value for the $^4$He mass fraction produced in
the Big Bang as predicted by the theory at a given value of $\eta$. $\sigma_1$ is the
uncertainty in that value derived from the Monte-Carlo calculations [3] and
is a measure of the theoretical uncertainty in the Big Bang calculation.
Similarly one can write down an expression for the observational likelihood
function. Assuming Gaussian errors, the likelihood function for the
observations would take a form similar to that in (4.1).

A total likelihood function for each value of $\eta$ is derived by convolving
the theoretical and observational distributions, which for $^4$He is given by

$$L^{^4{\rm He}}_{\rm total}(\eta) = \int dY\, L_{\rm BBN}(Y, Y_{\rm BBN}(\eta))\, L_{\rm O}(Y, Y_{\rm O}). \qquad (4.2)$$



An analogous calculation is performed [18] for $^7$Li. The resulting likelihood
functions from the observed abundances given in equations (3.1) and (3.2)
are shown in Figure 4. As one can see, there is very good agreement between
$^4$He and $^7$Li in the range $\eta_{10} \simeq 1.5 - 5.0$. The double-peaked nature
of the $^7$Li likelihood function is due to the presence of a minimum in the
predicted lithium abundance. For a given observed value of $^7$Li, there are
two likely values of $\eta$.
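
The convolution in equation (4.2) can be sketched numerically for $^4$He. The example below is illustrative only and makes several assumptions not taken from the likelihood analysis itself: the theoretical central value uses the fit of equation (8.2) with three neutrino flavors (quoted as valid near $\eta_{10} \sim 2$), the theoretical width `sigma_th` is a placeholder, and the observational width combines the two uncertainties of (3.1) in quadrature. For two Gaussians the integral also has a closed form, which serves as a check.

```python
import math


def y_bbn(eta10: float) -> float:
    """Central 4He mass fraction from the fit of equation (8.2), N_nu = 3."""
    return 0.2262 + 0.0135 * math.log(eta10)


def gaussian(y: float, mean: float, sigma: float) -> float:
    """Unnormalised Gaussian likelihood, as in equation (4.1)."""
    return math.exp(-0.5 * ((y - mean) / sigma) ** 2)


def total_likelihood(eta10, y_obs, sigma_obs, sigma_th, steps=4000):
    """Trapezoidal estimate of the convolution integral in equation (4.2)."""
    mean_th = y_bbn(eta10)
    width = 8.0 * max(sigma_obs, sigma_th)
    lo = min(mean_th, y_obs) - width
    hi = max(mean_th, y_obs) + width
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gaussian(y, mean_th, sigma_th) * gaussian(y, y_obs, sigma_obs)
    return total * h


def total_likelihood_closed_form(eta10, y_obs, sigma_obs, sigma_th):
    """Analytic result for the same integral of two Gaussians."""
    var = sigma_obs ** 2 + sigma_th ** 2
    norm = math.sqrt(2.0 * math.pi) * sigma_obs * sigma_th / math.sqrt(var)
    return norm * math.exp(-0.5 * (y_bbn(eta10) - y_obs) ** 2 / var)


if __name__ == "__main__":
    sigma_obs = math.hypot(0.002, 0.005)  # stat and syst of (3.1) in quadrature
    sigma_th = 0.0005                     # placeholder theory uncertainty
    for eta10 in (1.0, 2.0, 5.0, 10.0):
        value = total_likelihood(eta10, 0.238, sigma_obs, sigma_th)
        print(f"eta10 = {eta10:4.1f}  L_total = {value:.3e}")
```

Because $Y$ depends only logarithmically on $\eta$, the resulting $^4$He likelihood is broad in $\eta_{10}$, which is why a second element is needed to narrow the range.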







Fig. 4. Likelihood distribution for each of $^4$He and $^7$Li, shown as a function of $\eta$.



The combined likelihood, for fitting both elements simultaneously, is
given by the product of the two functions in Figure 4. The 95% CL region
covers the range $1.55 < \eta_{10} < 4.45$, with the two peaks occurring at $\eta_{10} = 1.9$
and $3.5$. This range corresponds to values of $\Omega_B$ between

$$0.006 < \Omega_B h^2 < 0.016. \qquad (4.3)$$
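
The conversion from $\eta_{10}$ to $\Omega_B h^2$ follows from the present photon number density and the critical density. The sketch below assumes a present CMB temperature of 2.725 K and a mass of one proton per baryon, and uses rounded physical constants. None of these inputs is quoted in the text, and the function names are hypothetical.

```python
import math

# Rounded physical constants in CGS units (assumed values)
K_B = 1.380649e-16         # Boltzmann constant, erg/K
HBAR = 1.054572e-27        # reduced Planck constant, erg s
C_LIGHT = 2.997925e10      # speed of light, cm/s
M_PROTON = 1.672622e-24    # proton mass, g
ZETA3 = 1.2020569          # Riemann zeta(3)
RHO_CRIT_H2 = 1.8785e-29   # critical density for h = 1, g/cm^3


def photon_density(t0_kelvin: float = 2.725) -> float:
    """Blackbody photon number density, 2 zeta(3)/pi^2 (kT/hbar c)^3, in cm^-3."""
    return 2.0 * ZETA3 / math.pi ** 2 * (K_B * t0_kelvin / (HBAR * C_LIGHT)) ** 3


def omega_b_h2(eta10: float, t0_kelvin: float = 2.725) -> float:
    """Baryon density parameter times h^2 for a given eta_10 = 1e10 * eta."""
    n_baryon = eta10 * 1e-10 * photon_density(t0_kelvin)
    return n_baryon * M_PROTON / RHO_CRIT_H2


if __name__ == "__main__":
    for eta10 in (1.55, 4.45):
        print(f"eta10 = {eta10}: Omega_B h^2 ~ {omega_b_h2(eta10):.4f}")
```

With these assumptions the two ends of the 95% CL range in $\eta_{10}$ map onto values close to the bounds of equation (4.3).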






5 More data 

Because there are no known astrophysical sites for the production of
Deuterium, all observed D is assumed to be primordial. As a result, any
firm determination of a Deuterium abundance establishes an upper bound
on $\eta$ which is robust.

Deuterium abundance information is available from several astrophysical
environments, each corresponding to a different evolutionary epoch. In
the ISM, corresponding to the present epoch, we have [19] (D/H)$_{\rm ISM}$ =
$(1.60 \pm 0.09) \times 10^{-5}$. This measurement allows us to set the upper limit $\eta_{10} < 9$
and is shown by the lower right of the solid box in Figure 1. There are,
however, serious questions regarding the homogeneity of this value in the ISM.
There may be evidence for considerable dispersion in D/H [20], as is the
case with $^3$He [21]. There is also a solar abundance measurement of D/H
which indicates that [22, 23] (D/H)$_\odot \approx (2.6 \pm 0.6 \pm 1.4) \times 10^{-5}$. This value
for presolar D/H is consistent with measurements of surface abundances of
HD on Jupiter, D/H = $(2.7 \pm 0.7) \times 10^{-5}$ [24].

Finally, there have been several reported measurements of D/H in high
redshift quasar absorption systems. Such measurements are in principle
capable of determining the primordial value for D/H and hence $\eta$, because
of the strong and monotonic dependence of D/H on $\eta$. However, at present,
detections of D/H using quasar absorption systems do not yield a conclusive
value for D/H. The first of these measurements [25] indicated a rather high
D/H ratio, D/H $\approx (1.9 - 2.5) \times 10^{-4}$ (though see [26]). More recently, a
similarly high value of D/H = $(2.0 \pm 0.5) \times 10^{-4}$ was reported in a relatively
low redshift system (making it less susceptible to interloper problems) [27]. This
was confirmed in [28], where a 95% CL lower bound to D/H was reported
as $8 \times 10^{-5}$. In contrast, other quasar absorption systems show low values
of D/H = $(3.4 \pm 0.3) \times 10^{-5}$ [29]. However, it was also noted [30] that when
using mesoturbulent models to account for the velocity field structure in
these systems, the abundance may be somewhat higher ($3.5 - 5 \times 10^{-5}$).
This may be quite significant, since at the upper end of this range ($5 \times 10^{-5}$)
all of the element abundances are consistent, as will be discussed shortly.
The upper range of quasar absorber D/H is shown by the dashed box in
Figure 1.

There are also several types of $^3$He measurements. The meteoritic
extractions yield a presolar value for $^3$He/H = $1.5 \times 10^{-5}$. In addition, there
are several ISM measurements of $^3$He in galactic HII regions [21] which
show a wide dispersion which may be indicative of pollution or a bias [31],
($^3$He/H)$_{\rm HII} \simeq (1 - 5) \times 10^{-5}$. There is also a recent ISM measurement of
$^3$He [32] with ($^3$He/H)$_{\rm ISM} = 2.1^{+0.9}_{-0.8} \times 10^{-5}$. Finally, there are observations
of $^3$He in planetary nebulae [33] which show a very high $^3$He abundance
of $^3$He/H $\sim 10^{-3}$. None of these determinations represent the primordial
$^3$He abundance, and as will be discussed below, their relation to the
primordial abundance is heavily dependent on both stellar and chemical evolution.

6 More analysis 

It is interesting to compare the results from the likelihood functions of $^4$He
and $^7$Li with that of D/H. To include D/H, one would proceed in much
the same way as with the other two light elements. We compute likelihood
functions for the BBN predictions as in equation (4.1) and the likelihood
function for the observations. These are then convolved as in equation (4.2).

Using D/H = $(2.0 \pm 0.5) \times 10^{-4}$ we would find excellent agreement
between $^4$He, $^7$Li and D/H and predict for $\eta$, $\eta_{10} = [\ldots]$, corresponding
to $[\ldots]$. The absence of any overlap with the high-$\eta$ peak of the
$^7$Li distribution has considerably lowered the upper limit to $\eta$. Overall, the
concordance limits in this case are dominated by the Deuterium likelihood
function.

If instead we assume that the low value [29] of D/H = $(3.4 \pm 0.3) \times 10^{-5}$
is the primordial abundance, there is hardly any overlap between the D and
the $^7$Li and $^4$He distributions. In this case, D/H is just compatible (at
the $2\sigma$ level) with the other light elements, and the peak of the likelihood
function occurs at $\eta_{10} \simeq 4.8$.

It is important to recall, however, that the true uncertainty in the low
D/H systems might be somewhat larger. If we allow D/H to be as large as
$5 \times 10^{-5}$, the peak of the D/H likelihood function shifts down to $\eta_{10} \simeq 4$.
In this case, there would be a near perfect overlap with the high-$\eta$ $^7$Li peak,
and since the $^4$He distribution function is very broad, this would be a highly
compatible solution.

7 Chemical evolution 

Because we cannot directly measure the primordial abundances of any of
the light element isotopes, we are required to make some assumptions
concerning the evolution of these isotopes. As has been discussed above, $^4$He
is produced in stars along with Oxygen and Nitrogen. $^7$Li can be destroyed
in stars and produced in several (though still uncertain) environments. D
is totally destroyed in the star formation process and $^3$He is both produced
and destroyed in stars with fairly uncertain yields. It is therefore preferable,
if possible, to observe the light element isotopes in a low metallicity
environment. Such is the case with $^4$He and $^7$Li. These elements are observed
in environments which are as low as 1/50th and 1/1000th solar metallicity,
respectively, and we can be fairly assured that the abundance determinations
of these isotopes are close to primordial. If the quasar absorption
system measurements of D/H stabilize, then this too may be very close
to a primordial measurement. Otherwise, to match the solar and present
abundances of D and $^3$He to their primordial values requires a model of
galactic chemical evolution.

While it is beyond the scope of this lecture to discuss the details of
chemical evolutionary models, it is worth noting that Deuterium is predicted to
be a monotonically decreasing function of time. The degree to which D is
destroyed is, however, a model-dependent question. The evolution of $^3$He is
even more complicated. Stellar models predict that substantial amounts of
$^3$He are produced in stars between 1 and 3 $M_\odot$. For example, in the models
of Iben and Truran [34] a 1 $M_\odot$ star will yield a $^3$He abundance which is
nearly three times as large as its initial (D + $^3$He) abundance. It should be
emphasized that this prediction is in fact consistent with the observation
of high $^3$He/H in planetary nebulae [33]. However, the implementation of
standard model $^3$He yields in chemical evolution models leads to an
overproduction of $^3$He/H, particularly at the solar epoch [31,35]. While the
overproduction is problematic for any initial value of D/H, it is
particularly bad in models with a high primordial D/H. In Scully et al. [36],
models of galactic chemical evolution were proposed with the aim of
reducing a primordial D/H abundance of $2 \times 10^{-4}$ to the present ISM value
without overproducing heavy elements and remaining consistent with the
other observational constraints typically imposed on such models.

The overproduction of $^3$He relative to the solar meteoritic value seems
to be a generic feature of chemical evolution models when $^3$He production
in low mass stars is included. This result appears to be independent of
the chemical evolution model and is directly related to the assumed stellar
yields of $^3$He. It has recently been suggested that at least some low mass
stars may indeed be net destroyers of $^3$He if one includes the effects of extra
mixing below the conventional convection zone in low mass stars on the
red giant branch [37, 38]. The extra mixing does not take place for stars
which do not undergo a Helium core flash (i.e. stars $> 1.7 - 2\ M_\odot$). Thus
stars with masses less than 1.7 $M_\odot$ are responsible for the $^3$He destruction.
Using the yields of Boothroyd and Malaney [38], it was shown [39] that
these reduced $^3$He yields in low mass stars can account for the relatively
low solar and present day $^3$He/H abundances observed. In fact, in some
cases, $^3$He was underproduced. To account for the $^3$He evolution and the
fact that some low mass stars must be producers of $^3$He, as indicated by the
planetary nebulae data, it was suggested that the new yields apply only to
a fraction (albeit large) of low mass stars [39,40].

8 Constraints from BBN 

Limits on particle physics beyond the standard model are mostly
sensitive to the bounds imposed on the $^4$He abundance. As discussed earlier,
the neutron-to-proton ratio is fixed by its equilibrium value at the
freeze-out of the weak interaction rates at a temperature $T_f \sim 1$ MeV, modulo the
occasional free neutron decay. Furthermore, freeze-out is determined by the
competition between the weak interaction rates and the expansion rate of
the Universe,

$$G_F^2\, T_f^5 \sim \Gamma_{\rm wk}(T_f) = H(T_f) \sim \sqrt{G_N N}\; T_f^2, \qquad (8.1)$$

where $N$ counts the total (equivalent) number of relativistic particle species.
At $T \sim 1$ MeV, $N = 43/4$. The presence of additional neutrino flavors (or
any other relativistic species) at the time of nucleosynthesis increases the
overall energy density of the Universe and hence the expansion rate, leading
to a larger value of $T_f$, $(n/p)$, and ultimately $Y_p$. Because of the form of
equation (8.1) it is clear that just as one can place limits [41] on $N$, any
changes in the weak or gravitational coupling constants can be similarly
constrained (for a recent discussion see [42]).
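
The sensitivity to extra relativistic species can be illustrated with a scaling argument. Balancing a weak rate proportional to $T^5$ against an expansion rate proportional to $\sqrt{N}\,T^2$ gives $T_f \propto N^{1/6}$. The sketch below applies this scaling. The standard freeze-out temperature of 0.72 MeV and the mass difference of 1.293 MeV are assumed illustrative values, the count of degrees of freedom is the usual one for photons, $e^\pm$ pairs and two-component neutrinos, and the function names are hypothetical.

```python
import math

DELTA_M_MEV = 1.293  # neutron-proton mass difference (assumed value)


def relativistic_dof(n_nu: float) -> float:
    """N = 2 (photons) + 7/8 * [4 (e+ and e-) + 2 * n_nu (neutrinos)]."""
    return 2.0 + 7.0 / 8.0 * (4.0 + 2.0 * n_nu)


def freeze_out_temperature(n_nu: float, t_f_standard_mev: float = 0.72) -> float:
    """Freeze-out temperature scaled from an assumed standard value using
    T_f ~ N^(1/6), which follows from rate ~ T^5 and H ~ sqrt(N) T^2."""
    ratio = relativistic_dof(n_nu) / relativistic_dof(3.0)
    return t_f_standard_mev * ratio ** (1.0 / 6.0)


def n_over_p_at_freeze_out(n_nu: float) -> float:
    """Boltzmann factor exp(-Delta m / T_f) at the scaled freeze-out."""
    return math.exp(-DELTA_M_MEV / freeze_out_temperature(n_nu))


if __name__ == "__main__":
    for n_nu in (2.0, 3.0, 4.0):
        print(f"N_nu = {n_nu:.0f}: N = {relativistic_dof(n_nu):.2f}, "
              f"T_f ~ {freeze_out_temperature(n_nu):.3f} MeV, "
              f"(n/p) ~ {n_over_p_at_freeze_out(n_nu):.4f}")
```

For three neutrino flavors the count reproduces $N = 43/4$, and each additional flavor raises $T_f$ by a few percent. This neglects neutron decay after freeze-out, so it indicates only the direction and rough size of the effect on $(n/p)$.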

One can parameterize the dependence of $Y$ on $N_\nu$ by

$$Y = 0.2262 + 0.0131\,(N_\nu - 3) + 0.0135 \ln \eta_{10} \qquad (8.2)$$

in the vicinity of $\eta_{10} \sim 2$. Equation (8.2) also shows the weak (log)
dependence on $\eta$. However, rather than use (8.2) to obtain a limit, it is preferable
to use the likelihood method.
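
As a quick illustration of how the fit in equation (8.2) trades $Y$ against $N_\nu$, the sketch below evaluates it and inverts it for $N_\nu$. It is meant only to show the sensitivity. As stated above, actual limits come from the likelihood method, and the fit is quoted as valid only near $\eta_{10} \sim 2$. The function names are illustrative.

```python
import math


def helium_fit(n_nu: float, eta10: float) -> float:
    """Equation (8.2): Y as a function of N_nu and eta_10."""
    return 0.2262 + 0.0131 * (n_nu - 3.0) + 0.0135 * math.log(eta10)


def n_nu_from_y(y: float, eta10: float) -> float:
    """Invert equation (8.2) for N_nu at fixed eta_10."""
    return 3.0 + (y - 0.2262 - 0.0135 * math.log(eta10)) / 0.0131


if __name__ == "__main__":
    print(f"Y(N_nu = 3, eta10 = 2) = {helium_fit(3.0, 2.0):.4f}")
    print(f"N_nu for Y = 0.238 at eta10 = 2: {n_nu_from_y(0.238, 2.0):.2f}")
```

One extra neutrino flavor changes $Y$ by 0.0131 in this fit, which is larger than the quoted uncertainties in (3.1). This is why $^4$He dominates the limits on $N_\nu$.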

Just as $^4$He and $^7$Li were sufficient to determine a value for $\eta$, a limit on
$N_\nu$ can be obtained as well [4,18,43,44]. The likelihood approach utilized
above can be extended to include $N_\nu$ as a free parameter. Since the light
element abundances can be computed as functions of both $\eta$ and $N_\nu$, the
likelihood function can be defined by [43] replacing the quantity $Y_{\rm BBN}(\eta)$ in
equation (4.1) with $Y_{\rm BBN}(\eta, N_\nu)$ to obtain $L^{^4{\rm He}}_{\rm total}(\eta, N_\nu)$. Again, similar
expressions are needed for $^7$Li and D.

The peaks of the distribution as well as the allowed ranges of $\eta$ and $N_\nu$
are easily discerned in the contour plots of Figures 5 and 6, which show the
50%, 68% and 95% confidence level contours in $L_{47}$ and $L_{247}$ projected onto
the $\eta$-$N_\nu$ plane, for high and low D/H as indicated. The crosses show the
location of the peaks of the likelihood functions. $L_{47}$ peaks at $N_\nu = 3.2$,
$\eta_{10} = 1.85$ and at $N_\nu = 2.6$, $\eta_{10} = 3.6$. The 95% confidence level allows the
following ranges in $\eta$ and $N_\nu$:

$$1.7 < N_\nu < 4.5, \qquad 1.4 < \eta_{10} < 4.9. \qquad (8.3)$$

Note, however, that the ranges in $\eta$ and $N_\nu$ are strongly correlated, as is
evident in Figure 5.

With high D/H, $L_{247}$ peaks at $N_\nu = 3.3$, and also at $\eta_{10} = 1.85$. In this
case the 95% contour gives the ranges

$$2.2 < N_\nu < 4.4, \qquad 1.4 < \eta_{10} < 2.4. \qquad (8.4)$$
Fig. 5. 50%, 68% & 95% C.L. contours of $L_{47}$ and $L_{247}$ where observed
abundances are given by equations (3.1) and (3.2), and high D/H.

Fig. 6. 50%, 68% & 95% C.L. contours of $L_{47}$ and $L_{247}$ where observed
abundances are given by equations (3.1) and (3.2), and low D/H.
Note that within the 95% CL range, there is also a small area with $\eta_{10} = 3.2 - 3.5$
and $N_\nu = 2.5 - 2.9$.

Similarly, for low D/H, $L_{247}$ peaks at $N_\nu = 2.4$ and $\eta_{10} = 4.55$. The
95% CL upper limit is now $N_\nu < 3.2$, and the range for $\eta$ is $3.9 < \eta_{10} < 5.4$.
It is important to stress that with the increase in the determined value of
D/H [29] in the low D/H systems, these abundances are now consistent
with the standard model value of $N_\nu = 3$ at the $2\sigma$ level.
THE COSMIC MICROWAVE BACKGROUND:
FROM DETECTOR SIGNALS TO CONSTRAINTS
ON THE EARLY UNIVERSE PHYSICS



F.R. Bouchet, J.-L. Puget and J.M. Lamarre



Abstract 

These lecture notes discuss observations of the Cosmic Microwave
Background (CMB). The theory of primary CMB anisotropies has
been extensively covered in a recent Les Houches summer school;
it is not developed here. The notes concentrate mostly on the
observational situation and on future missions, with special emphasis
on Planck, which carries CMB instruments of the third generation,
namely cryogenically cooled ones. We discuss rather extensively
foregrounds and techniques to remove systematic effects from the data
and compress them in steps, from time-ordered data from individual
detectors to maps at several frequencies, then to a single map of the
CMB anisotropies and finally to its power spectrum and cosmological
parameters.

1 Introduction 

The Cosmic Microwave Background is the only really diffuse component of 
the isotropic backgrounds observed from radio to gamma rays and it domi- 
nates the radiation content of the universe. Its very nearly pure Planckian 
spectrum, in agreement with the standard and simplest big bang model, is 
an essential tool for observational cosmology. It has been shown that high 
accuracy observations of its spectrum and of its spatial structure are the 
best observational tool both for the determination of the global cosmolog- 
ical parameters and to constrain observationally the physics of the early 
universe. 

These lecture notes describe the present situation of observations of the
CMB spectrum and intensity anisotropies and the strategy for future
measurements of these, both in terms of new instruments and data analysis.

The theory of CMB anisotropies is described in great detail in the
lecture notes of the 1996 Les Houches school by Bond [22]. The reader will
be referred to those when needed.

Polarisation of the CMB will also be an important aspect of the third
generation CMB experiments. Nevertheless, the limitations of CMB
polarisation measurements are difficult to assess precisely today. Summarising
the detailed work done in the last few years on CMB intensity anisotropy
measurements and on the removal of systematic effects and of foregrounds
makes up a large fraction of these lecture notes. Similar work on the
polarisation measurements is under way but is far from being completed.
In consequence, these lecture notes contain little material on polarisation
measurements.

A new generation of instruments limited essentially by photon noise is
described. It relies on new technologies: sub-Kelvin cooled bolometers with
total power read-out electronics, hydrogen sorption coolers, and a
space-qualified dilution cooler.

We describe how to go from raw data to a sky map of the tempera- 
ture fluctuations which covers all the useful spatial frequencies for CMB 
cosmological work. This map is the basic tool to constrain cosmological 
parameters and physics of the early universe. One of its most important 
characteristics is its angular power spectrum. 

2 The cosmic background 

2.1 Components of the cosmic background 

The cosmic background is the present electromagnetic content of the uni- 
verse averaged over a large volume. Like the other average quantities char- 
acterising the present content of the universe this is an essential tool for 
physical cosmology although it is not easy to measure. Observationally it 
is defined as the isotropic background observed when one has removed the 
quasi isotropic components radiated in our Galaxy or in the solar system 
and of course all instrumental straylight. 

The Spectral Energy Distribution (hereafter SED) of this background 
is shown in Figure 1; it is dominated by a microwave component which 
contains 93% of the energy. It has a spectrum very close to a Planck function 
(see Sect. 2.3) and it is the only one to be clearly diffuse. It is the main 
object of this course. 

Other components of the cosmic background have been observed for a
long time. In the radio range a background, now known to be the sum
of extragalactic radio sources, was the first component of the cosmic
background to be observed. It contains only a very small fraction of the energy
($1.1 \times 10^{-6}$, see Tab. 1).

Fig. 1. Cosmic background spectrum from radio to gamma rays, as summarised
by Halpern and Scott in 1999. The intensity is shown per logarithmic frequency
interval, so that the energy in different bands can be directly compared.
Reprinted from [76].



Similarly, the X-ray background was discovered quite early by Giacconi
et al. [64] and is now very well measured. The gamma ray cosmic
background is also detected in the range 0.1 MeV to more than 1 GeV. The
X-ray background SED peaks around 30 keV. At lower energies, Hasinger
et al. [79] show that 75 to 80% of the intensity of the background has been
resolved into individual sources, and this fraction will probably significantly
increase with observations carried out with the Chandra observatory. Models of
the X-ray background assume it is mostly made of the emission of extincted
Active Galactic Nuclei (AGNs). The high energy tail extending in the
gamma ray range is likely to be of a similar nature.

Table 1. The energy distribution in the various components of the cosmic
background.

Frequency range   Intensity [W m$^{-2}$ sr$^{-1}$]   Fraction of cosmic background
Radio             $1.2 \times 10^{-12}$              $1.1 \times 10^{-6}$
CMB               $9.96 \times 10^{-7}$              0.93
Infrared          $(4 - 5.2) \times 10^{-8}$         0.04 - 0.05
Optical           $(2 - 4) \times 10^{-8}$           0.02 - 0.04
X-rays            $2.7 \times 10^{-10}$              $2.5 \times 10^{-4}$
Gamma rays        $3 \times 10^{-11}$                $2.5 \times 10^{-5}$


Another component of the cosmic background made of the integrated 
radiation from all galaxies in the ultraviolet, optical and infrared, has been 
predicted for a long time [121]. In the optical and near infrared it contains 
the redshifted stellar radiation not absorbed by dust. The absorbed energy 
is re-radiated in the far infrared. The main energy source is expected to be 
the nucleosynthesis of heavy elements in stars. 

It also contains the ultraviolet/optical radiation from the AGNs which 
probably draw their energy from accretion onto massive black holes in the 
centre of these galaxies. A large fraction of the energy radiated in the far ul- 
traviolet and soft X-rays is very efficiently absorbed by dust and re-radiated 
in the far infrared. The fraction of the energy coming out in the hard 
X-rays escapes absorption and is seen directly as the cosmic background 
above about 30 keV. 

The energy from these two processes is thus distributed between i) the
optical and near infrared wavelength range, which is dominated by stellar
radiation, and ii) the re-radiated part in the far infrared and submillimetre
range, which comes from nucleosynthesis in stars and from accretion onto
black holes in AGNs; their relative contribution is currently unknown. For
definiteness, we shall use $6 \times 10^{13}$ Hz as the limit between direct and
dust-reradiated radiation.

These two components of the cosmic background had not been detected
until 5 years ago. The combination of observations with the Cosmic
Background Explorer (COBE) satellite and of deep surveys in the
optical and near infrared with the Hubble Space Telescope (HST), the
Infrared Space Observatory (ISO) and large ground based telescopes has
led to a good measurement of the cosmic background in the far infrared
and submillimetre range, and to lower limits in the mid infrared, near infrared
and optical from deep surveys of extragalactic sources. These surveys have
Fig. 2. The cosmic background spectrum from ultra-violet to submillimetre after
Gispert et al. (2000). The two dotted lines connect upper and lower limits and 
show the range allowed by the observations. Reprinted from [65]. 



good enough sensitivities to lead to good estimates of the integrated brightness
of these sources. In the ultraviolet and near infrared, upper limits on
the integrated diffuse emission have been obtained from COBE and rocket
measurements. Finally, the propagation of ultra high energy gamma rays
from Mkn 501 and Mkn 421 has allowed upper limits to be put on the
mid infrared cosmic background, about a factor of 2 above the lower limit
obtained from the ISO cosmological counts at 15 µm. These limits constrain
rather well the cosmic background in a spectral region where direct
measurements are impossible due to the strong zodiacal cloud brightness.

It should be noted that the uncertainties in the cosmic background are
at present larger in the optical range than in the infrared. Taking this into
account, the ratio of energy in the thermal infrared to the energy in the
optical-UV is between 1 and 2.5.

The energy distribution of the cosmic background between the different 
wavelength ranges is given in Table 1. 
2.2 Formation of the CMB, recombination



In the early universe, matter and radiation are in quasi-perfect thermal 
equilibrium. Two time scales are important in this respect. Elastic scat- 
tering of photons by free electrons through Thomson scattering has a mean 
free path given by 



$$\lambda_{\rm scatt} = \frac{1}{n_0 x_e (1+z)^3 \sigma_T} = \frac{7.5 \times 10^{30}}{x_e (1+z)^3}\ {\rm cm}, \tag{2.1}$$



where $n_0 x_e$ is the density of free electrons, and $\sigma_T$ stands for the Thomson
cross-section. The second important one is the probability for absorption 
(or emission) of photons through free-free interactions which provides full 
thermalisation to a Planckian spectrum for z > 3 x 10^. Comparing the
Thomson mean free path with the horizon $H(z)$ as a function of redshift:

$$\frac{\lambda_{\rm scatt}}{H(z)} = \frac{8.3}{x_e (1+z)} \tag{2.2}$$



it is clear that the universe, which is fully ionised at z = 1100, is opaque in 
this redshift range. When the temperature in the universe becomes smaller 
than about 3000 K, the cosmic plasma recombines and the ionisation rate 
Xe falls from 1 at z > 1000 down to Xe < 10“^ at z < 1000. The universe be- 
comes thus transparent to background photons over a narrow redshift range 
of 100 or less. Photons will then propagate freely as long as galaxies and 
quasars do not reionise the universe. The Thomson optical depth between
the reionisation redshift $z_{\rm reion}$ and the present time is, for the Euclidean
case:

$$\tau_e = 0.01 \times (1 + z_{\rm reion})^{3/2}. \tag{2.3}$$

For the standard cosmological model, $z_{\rm reion} \simeq 10$; it is thus clear that the
cosmic background at redshift 1000 can be observed with only small secondary
distortions.
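
As a rough numerical check of equations (2.1) to (2.3), the following Python sketch evaluates the mean free path, its ratio to the horizon and the optical depth. The present-day electron density `N0_ELECTRONS` is an assumed value (it is not quoted in the text), chosen to be consistent with the coefficient of equation (2.1); the coefficients 8.3 and 0.01 are taken as read from equations (2.2) and (2.3), and the function names are illustrative only.

```python
# Sketch of equations (2.1)-(2.3). N0_ELECTRONS is an assumption, not a value
# quoted in the text; the other coefficients are read from the equations.
SIGMA_T = 6.652e-25      # Thomson cross-section in cm^2
N0_ELECTRONS = 2.0e-7    # assumed present-day electron density in cm^-3


def mean_free_path_cm(z, x_e=1.0):
    """Thomson mean free path of equation (2.1), in cm."""
    return 1.0 / (N0_ELECTRONS * x_e * (1.0 + z) ** 3 * SIGMA_T)


def path_over_horizon(z, x_e=1.0):
    """Ratio of equation (2.2), using the coefficient 8.3."""
    return 8.3 / (x_e * (1.0 + z))


def optical_depth(z_start):
    """Optical depth of equation (2.3) for the Euclidean case."""
    return 0.01 * (1.0 + z_start) ** 1.5


coefficient = mean_free_path_cm(0.0)      # close to 7.5e30 cm
ratio_1100 = path_over_horizon(1100.0)    # much smaller than 1: opaque
tau_10 = optical_depth(10.0)              # optical depth out to redshift 10
print(f"{coefficient:.2e} cm, ratio {ratio_1100:.1e}, tau {tau_10:.2f}")
```

With full ionisation the ratio at z = 1100 is far below unity, which is the quantitative content of the statement that the universe is opaque in this redshift range.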

2.3 The CMB spectrum 

It is rather remarkable that the temperature of the main component of
the cosmic background can be computed from basic physics using only
two cosmological observables. Alpher et al. [8] showed that the chemical
elements other than hydrogen, which cannot be explained by nucleosynthesis
in stars, could have been formed in a hot big bang. Nevertheless they did
not take into account the fact that radiation dominates over matter at that
time. This was taken into account by Gamow [60,61], but the first correct
calculation is given by Alpher and Herman [9]; they predicted a value around
5 K for the cosmic background radiation temperature.
The argument goes as follows. To explain the large amount of Helium
observed today (about 25%), Deuterium should be synthesised first. This
can take place only when the cosmic plasma is in a rather narrow range of
temperatures, high enough for the fusion reaction to take place, and low
enough that Deuterium nuclei are not photo-dissociated. This temperature
is $T \sim 10^{9}$ K. The dynamics of the universe being dominated by radiation
at this temperature, its expansion rate is then fixed by general relativity
and the expansion time is 200 seconds. It is thus easy to compute the
baryon density needed at this time which would lead to the synthesis of a
substantial fraction of the mass into Deuterium (and thus then into Helium),

$$t_{\rm exp}(T = 10^{9}\,{\rm K}) \times n_B \times (\sigma_{pn \to D} \times v) \sim 1, \tag{2.4}$$

where $\sigma_{pn \to D}$ stands for the Deuterium production cross section which is
at this temperature of the order of 10“® mb, and v is the baryon velocity 
dispersion at $T \sim 10^{9}$ K; this implies a baryon density $n_B \simeq 10^{18}$ cm$^{-3}$.

Knowing the present density in the universe ($\sim 10^{-7}$ cm$^{-3}$), the expansion
factor since primordial nucleosynthesis, $1 + z_{NS}$, is then given by

$$1 + z_{NS} = \left(\frac{10^{18}}{10^{-7}}\right)^{1/3} \simeq 2 \times 10^{8}. \tag{2.5}$$

This leads to a prediction of the temperature of the black body radiation
content of the universe today:

$$T_{BB} = \frac{10^{9}\ {\rm K}}{1 + z_{NS}} \simeq 5\ {\rm K}. \tag{2.6}$$

This remarkable prediction based only on the helium fraction and a rough 
estimate of the present baryon density, was spectacularly confirmed in 1965 
when Penzias and Wilson [122] announced “A Measurement of Excess 
Antenna Temperature at 4080 Mc/s” which was interpreted in the same 
journal issue by Dicke et al. [47] as the CMB with a temperature of 3.5±1 K. 
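
The arithmetic behind equations (2.5) and (2.6) is short enough to check directly. The sketch below is only an illustration: it takes the densities to be $10^{18}$ and $10^{-7}$ cm$^{-3}$ and the nucleosynthesis temperature to be $10^{9}$ K, as read from the argument above, and the variable names are hypothetical.

```python
# Order-of-magnitude version of the Alpher-Herman/Gamow estimate.
# All three inputs are rough values read from the argument in the text.
T_NUCLEOSYNTHESIS_K = 1.0e9   # temperature at deuterium synthesis, in K
N_BARYON_THEN = 1.0e18        # baryon density at that time, in cm^-3
N_BARYON_NOW = 1.0e-7         # rough present baryon density, in cm^-3

# Densities dilute as the cube of the expansion factor (equation 2.5).
expansion_factor = (N_BARYON_THEN / N_BARYON_NOW) ** (1.0 / 3.0)

# The radiation temperature falls as the inverse expansion factor (equation 2.6).
t_today_k = T_NUCLEOSYNTHESIS_K / expansion_factor

print(f"1 + z_NS = {expansion_factor:.2e}, T_BB = {t_today_k:.1f} K")
```

Because the temperature depends only on the cube root of the density ratio, an error of a factor 10 in either density changes the predicted temperature by only about a factor 2.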

In the simplest hot big bang model, there is no energy released in the
radiation between $z \simeq 10^{9}$ (electron-positron annihilation) and $z \simeq 10$ (first
galaxies and quasars). Thus the background radiation which decouples at
$z = 1000$ should be very close to a Planck function. This prediction was
much more difficult to test observationally than the existence of the background.
It took several decades before balloon borne experiments by Woody
and Richards [166-168] could show that the spectrum of the CMB had a
maximum at a frequency around $3 \times 10^{11}$ Hz, as it should if the intensity
measured at centimetre wavelengths was the Rayleigh-Jeans part of a Planck
function with a temperature around 3 K. This was a very important result
but, in such a balloon borne experiment, systematic effects were such that
deviations from a Planck function could not be assessed with an accuracy
better than about 30%.
This situation is very reminiscent of the present one with respect to the
measurements of the anisotropies of the CMB on small scale as will be seen 
below. 
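
To make the link between the centimetre measurements and the position of the maximum concrete, the following Python sketch locates the peak of a Planck function in frequency and evaluates how close the spectrum is to its Rayleigh-Jeans form at 4080 MHz. It is a minimal illustration under the assumption of a 3 K black body; the function names are hypothetical. The peak of the intensity per unit frequency comes out at roughly $2 \times 10^{11}$ Hz, the same order of magnitude as the frequency quoted above.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def wien_x(iterations=50):
    """Solve x = 3 (1 - exp(-x)), the peak condition of B_nu, by fixed-point iteration."""
    x = 3.0
    for _ in range(iterations):
        x = 3.0 * (1.0 - math.exp(-x))
    return x


def peak_frequency_hz(temperature_k):
    """Frequency at which the Planck intensity per unit frequency is maximum."""
    return wien_x() * K_BOLTZMANN * temperature_k / H_PLANCK


def planck_over_rayleigh_jeans(frequency_hz, temperature_k):
    """Ratio of the Planck intensity to its Rayleigh-Jeans approximation."""
    x = H_PLANCK * frequency_hz / (K_BOLTZMANN * temperature_k)
    return x / math.expm1(x)


peak_3k = peak_frequency_hz(3.0)
ratio_4080 = planck_over_rayleigh_jeans(4.08e9, 3.0)
print(f"peak at {peak_3k:.2e} Hz, Planck/RJ at 4080 MHz = {ratio_4080:.3f}")
```

At 4080 MHz the ratio is within a few percent of unity, so a centimetre measurement alone cannot distinguish a Planck function from a pure power law; the turnover has to be observed near the peak.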

It took another 12 years before the COBE satellite made an extremely 
accurate check of the Planckian nature of the CMB spectrum as shown in 
Figure 3. 

Two types of distortions can affect the CMB. Energy released in the
cosmic plasma at rather low redshifts (typically after recombination) heats
the plasma to a temperature higher than the background radiation temperature.
Compton scattering of photons on the hot electrons tends to shift,
on average, the photon spectrum towards higher energies without changing
the photon number. This distortion is referred to as a Compton distortion.
With respect to the initial Planck spectrum, the spectrum is depleted in the
Rayleigh-Jeans part and boosted in the Wien part. For a non relativistic
plasma, it is characterised by a single parameter $y$ which is the integrated
Compton optical depth

$$y = \int n_e \sigma_T \, dl. \tag{2.7}$$

A second type of distortion appears for energy released before reionisation 
but at times such that thermalisation of the energy between plasma and 
radiation takes place but too late for relaxing to a Planckian spectrum. 
This leaves a Bose-Einstein spectrum characterised by a non zero chemical
potential $\mu$.
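
The size of $y$ along a single line of sight can be illustrated with a short Python sketch. It assumes the usual non relativistic convention in which the Thomson optical depth is weighted by $kT_e/m_e c^2$; this weighting, the cluster-like gas parameters and the function name `compton_y` are assumptions made for illustration and are not taken from the text.

```python
SIGMA_T_CM2 = 6.652e-25           # Thomson cross-section, cm^2
ELECTRON_REST_ENERGY_KEV = 511.0  # m_e c^2 in keV
CM_PER_MPC = 3.086e24             # centimetres in one megaparsec


def compton_y(n_e_cm3, kt_e_kev, path_mpc):
    """Compton parameter for a uniform, isothermal slab of hot gas.

    Assumed convention: optical depth times kT_e / (m_e c^2).
    """
    tau = n_e_cm3 * SIGMA_T_CM2 * path_mpc * CM_PER_MPC
    return tau * kt_e_kev / ELECTRON_REST_ENERGY_KEV


# Hypothetical cluster-like values: 1e-3 electrons per cm^3, kT_e = 5 keV, 1 Mpc path.
y_slab = compton_y(1.0e-3, 5.0, 1.0)

# In the Rayleigh-Jeans part the fractional temperature change is -2y.
rj_decrement = -2.0 * y_slab
print(f"y = {y_slab:.2e}, Rayleigh-Jeans dT/T = {rj_decrement:.2e}")
```

The negative sign in the Rayleigh-Jeans part reflects the depletion of low energy photons described above; photon number is conserved, so the deficit reappears as an excess in the Wien part.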

The FIRAS sky spectra which are a function of frequency and position, 
have been separated by Fixsen et al. [57] into a Planckian monopole spec- 
trum, a dipole (associated with the motion of our Galaxy with respect to 
the CMB) of known spectrum but for which the amplitude and the direction 
were obtained from the FIRAS data, and a galactic component of unknown 
spectrum but assumed spatial distribution (two templates were used with- 
out changing significantly the CMB results). The results are displayed in 
Figure 4. Furthermore the residual anisotropies were correlated with the 
DMR anisotropies and their spectrum is also displayed in Figure 4. This 
spectrum is in good agreement with the one expected if the DMR 
anisotropies are $\delta T/T$ fluctuations. As concluded by Fixsen et al. [59]: “this
strongly suggests that the anisotropy observed by DMR, and corroborated by 
FIRAS, is due to temperature variations in the CMB”. 

The analysis of the spectrum of the residuals averaged over the good 
sky gives very tight upper limits for these two parameters measuring the 
expected deviations from a Planck spectrum: 

$$y < 2.5 \times 10^{-5} \tag{2.8}$$

$$\mu < 3.3 \times 10^{-4} \tag{2.9}$$
Fig. 3. (a) Experimental determinations of the spectrum of the CMB compared
with a Planck function at 2.736 K shown as a thin line. (b) Deviations of the
data from the best fit. This tightly constrains the energy releases allowed at redshifts
below the thermalisation epoch at z ~ 10^, see text and Figure 5. Reprinted from
Fixsen et al. (1996) and Smoot and Scott (1997) [57,147].
Fig. 4. (a) FIRAS data points with 200$\sigma$ error bars(!), together with a 2.728 K
Planck spectrum (solid line). (b) FIRAS dipole spectrum together with a curve
representing a 3.36 mK differential Planck spectrum. (c) Spectra of the correlated
part with Galactic templates given by DIRBE at 240 and 140 µm. The discrete
points show the sum of the Galactic components in FIRAS. (d) Spectrum of the
residual (dipole and galaxy subtracted) FIRAS fluctuations correlated with the
DMR ones. The curve is the 35 µK differential spectrum predicted by DMR, and
the dashed line is 3.6% of the Galactic spectrum in (c). Reprinted from [59].
[Figure 5 plot title: 95% C.L. Limits To Energy Release]

Fig. 5. Upper limit (from Smoot and Scott 1997) to the energy release allowed
by the FIRAS spectrum as a function of redshift. Reprinted from [147].



These (3$\sigma$) limits constrain very strongly the maximum energy released in
the universe around recombination: $\Delta E/E < 1 \times 10^{-4}$. This is shown in
Figure 5.
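
The step from the distortion parameters to a fractional energy release can be sketched numerically. The Python lines below assume the standard small-distortion relations $\Delta E/E \simeq 4y$ and $\mu \simeq 1.4\,\Delta E/E$, which are not stated in the text, and take the limits to be $y < 2.5 \times 10^{-5}$ and $\mu < 3.3 \times 10^{-4}$ as read from equations (2.8) and (2.9).

```python
# Assumed textbook relations for small distortions (not given in the text):
#   late energy release:  Delta E / E ~ 4 y
#   early energy release: mu ~ 1.4 * Delta E / E
Y_LIMIT = 2.5e-5    # limit of equation (2.8)
MU_LIMIT = 3.3e-4   # limit of equation (2.9)

energy_limit_from_y = 4.0 * Y_LIMIT
energy_limit_from_mu = MU_LIMIT / 1.4

print(f"from y:  Delta E/E < {energy_limit_from_y:.1e}")
print(f"from mu: Delta E/E < {energy_limit_from_mu:.1e}")
```

Under these assumptions the $y$ limit reproduces the bound of order $10^{-4}$ quoted for the epoch around recombination, while the $\mu$ limit gives a bound of a few $10^{-4}$ for earlier releases.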

We have seen that the components of the cosmic background outside
the microwave range contain less than 7% of the energy and are likely to
be dominated by sources produced at rather low redshifts. The Planckian
nature of the bulk of the diffuse part of the cosmic background, which is 
the CMB, has important implications. It tells us that very little energy 
release took place in the universe between z = 10^ and the formation of the
first extragalactic sources we see today. Thus the CMB we observe today is
simply the redshifted background which was present after recombination at
z = 1000 with a very small amount of secondary distortions. Furthermore,
the background at this time resulted from well understood interactions with 
the other constituents of the universe (baryonic matter through free-free and 
Compton interactions and dark matter and neutrinos through gravity). 

In summary, following the very accurate check of the Planckian spectrum 
of the CMB, which was a prediction of the simplest big bang model, we are 
left again with a very definite prediction of this model which brings together 
the thermal history and the theory of structure generation in the universe. 








In this model, the physics of the early universe generates the seed 
fluctuations which give rise to all the structures we see today through the 
development of gravitational instability. Besides the Planckian spectrum of 
the background at the time of recombination, this model relates through well 
understood linear physics, close to thermal equilibrium, the power spectrum 
of the inhomogeneities emerging from the early universe to the spectrum of 
inhomogeneities present after recombination at z = 1000: the so-called pri- 
mary anisotropies. 

This is summarised in the next section. 



3 CMB anisotropies 

3.1 Primary anisotropies

3.1.1 Fundamental physics and CMB anisotropies 

In the standard cosmological model, the origin of structures in the universe
is rooted in the early universe. Inflation is a mechanism which can
bring quantum fluctuations to macroscopic scales. Phase transitions in the
early universe can also generate topological defects that can contribute to
structure formation. Nevertheless, the present CMB anisotropy data do
not allow these to be the dominant source for structure formation.

Starting from a spectrum of fluctuations emerging from the early uni- 
verse after the inflation period, and for a given set of cosmological pa- 
rameters the development of perturbations can be fully computed for the 
different components: radiation, neutrinos, baryons, cold dark matter. At 
the time of recombination, the power spectrum of spatial distribution of 
these components is thus precisely predicted for any such model. 

Adiabatic perturbations develop under their own gravity when they are 
much larger than the horizon. After they enter the horizon, they become 
gravitationally stable and oscillate. All perturbations of the same scale are 
in phase as they were all laid down and started to develop at the same time. 
These will lead to the so called acoustic peaks in the resulting CMB power 
spectrum. 

In the case of active sources of fluctuations, as in defect theories,
perturbations are laid down at all times; they are generically isocurvature
and non-Gaussian, and the phases of the perturbations of a given scale
are incoherent, which does not lead to many oscillations as in the coherent
(e.g. inflationary) case. A single broad peak is expected, and this is why
current CMB anisotropy data already indicate that defects cannot be the
sole source of fluctuations.
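
The role of phase coherence can be illustrated with a toy model. In the Python sketch below each mode is a pure oscillation $\cos(k r_s + \varphi)$, where $k r_s$ (wavenumber times sound horizon) and the phase $\varphi$ are hypothetical stand-ins for the real perturbation physics; the only point is to compare the mean power when all modes of a given scale share the same phase with the case of random phases.

```python
import math
import random

random.seed(0)
N_MODES = 500
# Sample k * r_s over two full oscillation periods of the power.
K_RS = [i * 4.0 * math.pi / 100 for i in range(101)]


def mean_power(k_rs, coherent):
    """Average of cos^2(k r_s + phase) over N_MODES modes of the same scale."""
    total = 0.0
    for _ in range(N_MODES):
        # Coherent case: every mode starts with the same phase (taken as zero).
        phase = 0.0 if coherent else random.uniform(0.0, 2.0 * math.pi)
        total += math.cos(k_rs + phase) ** 2
    return total / N_MODES


coherent_power = [mean_power(k, True) for k in K_RS]
incoherent_power = [mean_power(k, False) for k in K_RS]

# Peak-to-trough contrast of the power as a function of scale.
contrast_coherent = max(coherent_power) - min(coherent_power)
contrast_incoherent = max(incoherent_power) - min(incoherent_power)
print(f"coherent: {contrast_coherent:.2f}, incoherent: {contrast_incoherent:.2f}")
```

In the coherent case the power oscillates strongly with scale, which is the toy analogue of a series of acoustic peaks; with random phases the oscillations average out to a nearly flat power of one half.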
3.1.2 The components of the primary fluctuations

At z = 1000 the universe becomes transparent and the radiation propagates
with little interaction. The basic physical mechanisms giving rise to the CMB
anisotropies we observe today are:

• the radiation temperature has fluctuations associated with those of 
the matter components; 

• the radiation leaves the last scattering surface at locations with different
gravitational potentials depending on the sign and amplitude of
the density perturbations around this point. The associated change
of frequency between this point and the observer is the Sachs-Wolfe
effect;

• the radiation undergoes a last scattering with matter in the last scat- 
tering surface which has a radial velocity in the potential wells of the 
density fluctuations. As mentioned above, the amplitude of the veloc- 
ity fluctuations encountered will be scale dependent as all fluctuations 
of the same size are in phase with each other. 

These three effects combined build the primary CMB anisotropies at recombination.
They all induce temperature perturbations on the CMB spectrum.

The development of perturbations reaching the non linear regime complicates
the problem somewhat. The same physics (interaction of the photons
propagating from the last scattering surface to the observer through a variable
gravitational potential) and, after reionisation, Compton collisions with
the electrons also induce temperature variations of the CMB and modifications
of the primary ones. They are called secondary anisotropies and are
described in Section 3.2.

We refer the reader to the Les Houches lecture notes of Bond [22] for 
a very complete description of the theory of CMB primary and secondary 
anisotropies power spectrum formation. 

3.1.3 Power spectrum of the fluctuations in an inflationary model 

Before we go on, let us first establish notations. The amplitude of any scalar
field (in particular the temperature anisotropy pattern) on the sphere, in
the direction of a unit vector $\hat{e} = (\theta, \phi)$, $A(\hat{e})$, can be decomposed into
spherical harmonics, $Y_{\ell m}$, as

$$A(\theta, \phi) = \sum_{\ell, m} a_{\ell m} Y_{\ell m}(\theta, \phi). \tag{3.1}$$

The multipole moments, $a_{\ell m}$, are independent for a statistically isotropic
field, i.e. $\langle a_{\ell m} a^*_{\ell' m'} \rangle = C_\ell \, \delta_{\ell \ell'} \delta_{m m'}$ (we denote by $\langle Y \rangle$ the average over an
ensemble of statistical realisations of the quantity $Y$). The function $C_\ell$ is
called the angular power spectrum; it is related to the two-point correlation
function of the field,

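$$\xi(\cos\theta) = \xi(\hat{e}_1 \cdot \hat{e}_2) = \langle A(\hat{e}_1) A(\hat{e}_2) \rangle \tag{3.2}$$

by the equality

$$\xi(\cos\theta) = \frac{1}{4\pi} \sum_\ell (2\ell + 1)\, C_\ell\, P_\ell(\cos\theta), \tag{3.3}$$

where $P_\ell$ stands for the Legendre polynomial of order $\ell$. It follows that
the variance of the distribution, i.e. the correlation function at zero lag, is
given by

$$\langle A^2 \rangle = \xi(1) = \frac{1}{4\pi} \sum_\ell (2\ell + 1)\, C_\ell. \tag{3.4}$$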

The angular power spectrum (or $\xi$) characterises completely a Gaussian
field, while all higher order functions are needed in principle to describe 
other cases. 
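
The relation between the angular power spectrum and the two-point correlation function can be checked numerically. The Python sketch below assumes the standard Legendre expansion $\xi(\cos\theta) = \sum_\ell (2\ell+1) C_\ell P_\ell(\cos\theta) / 4\pi$ for equation (3.3); the toy spectrum and the function names are hypothetical and chosen only for illustration.

```python
import math


def legendre_values(ell_max, x):
    """Return [P_0(x), ..., P_ell_max(x)] using the Bonnet recurrence."""
    values = [1.0, x]
    for ell in range(1, ell_max):
        nxt = ((2 * ell + 1) * x * values[ell] - ell * values[ell - 1]) / (ell + 1)
        values.append(nxt)
    return values[: ell_max + 1]


def correlation(c_ell, cos_theta):
    """Two-point correlation function from the angular power spectrum."""
    p = legendre_values(len(c_ell) - 1, cos_theta)
    total = sum((2 * ell + 1) * c * p[ell] for ell, c in enumerate(c_ell))
    return total / (4.0 * math.pi)


# Hypothetical toy spectrum: flat l(l+1)C_l for 2 <= l <= 50, no monopole or dipole.
c_ell = [0.0, 0.0] + [1.0 / (ell * (ell + 1)) for ell in range(2, 51)]

# Zero lag (cos theta = 1) gives the variance, since P_l(1) = 1 for every l.
variance = correlation(c_ell, 1.0)
direct_sum = sum((2 * ell + 1) * c for ell, c in enumerate(c_ell)) / (4.0 * math.pi)
xi_90_degrees = correlation(c_ell, 0.0)
print(f"variance = {variance:.4f}, xi(90 deg) = {xi_90_degrees:.4f}")
```

The zero-lag value agrees with the direct sum over multipoles, which is the statement that the variance is the correlation function at zero separation.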

At low $\ell$ the spectrum depends essentially on the power spectrum of
the density fluctuations emerging from the early universe. Nevertheless, at
higher spatial frequencies, $\ell > 20$, the spectrum depends also on the
cosmological parameters: the Hubble constant, the baryon density, the nature
of the dark matter, the cosmological constant, the geometry of the universe,
and the recombination redshift. Bond [22] shows in detail how these
cosmological parameters can be retrieved from the CMB power spectrum.

Some can be obtained almost independently of the others. The best 
example is the geometry of the universe which is rather uniquely correlated 
with the position of the first acoustic peak. For a Euclidean universe, it 
should be observed at $\ell \sim 200$ and move upward or downward according to
curvature (see Fig. 6). The present data described in Section 5.4 show that 
this is very nearly the case. 

Nevertheless, the determination of the other parameters is degenerate
if only low $\ell$ observations with limited signal to noise are available. Bond [22]
shows that one requires a power spectrum determination up to $\ell \sim 2500$
and an accuracy of a few 10“^ for the first acoustic peaks. This translates 
into observational requirements of an angular resolution of 5 arcmin and a
sensitivity $\delta T/T \sim 2 \times 10^{-6}$ per pixel.
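
The correspondence between multipole and angular scale used in these requirements can be sketched with the common rule of thumb $\theta \approx 180^\circ / \ell$. This rule is an approximation assumed here for illustration (it is not stated in the text), and the function names are hypothetical.

```python
def multipole_to_degrees(ell):
    """Approximate angular scale, in degrees, probed by multipole ell."""
    return 180.0 / ell


def arcmin_to_multipole(theta_arcmin):
    """Approximate multipole reached with an angular resolution in arcmin."""
    return 180.0 * 60.0 / theta_arcmin


first_peak_degrees = multipole_to_degrees(200)   # scale of the first acoustic peak
ell_at_5_arcmin = arcmin_to_multipole(5.0)       # multipole reached at 5 arcmin
print(f"l = 200 -> {first_peak_degrees:.1f} deg, 5 arcmin -> l ~ {ell_at_5_arcmin:.0f}")
```

With this rule the first acoustic peak of a Euclidean universe sits at about one degree, and a resolution of 5 arcmin reaches multipoles of about two thousand, comparable to the range required above.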

Polarisation measurements of the CMB will help remove degeneracies
and also check the consistency of the results since the intensity and polarised 
data should be correlated in a specific fashion. Furthermore, polarisation 
measurements are the best way to constrain the tensor to scalar ratio for 
the primordial perturbations. The degree of polarisation predicted for the 
Fig. 6. Numerical simulations showing the effect of curvature (mostly through
the dependence of the angular distance) on otherwise similar
anisotropies. One clearly sees the stretching/compression of the flat case pattern.
This figure was obtained from the web site of the BOOMERanG team
(see www.physics.ucsb.edu/~boomerang/press_images from North America and
oberon.romal.infn.it/boomerang from Europe).



CMB being of the order of 10% it leads to a similar sensitivity requirement. 
These are thus the goals of the third generation CMB experiments being 
built now. 



3.2 The secondary CMB anisotropies 

The secondary effects represent the ensemble of temperature fluctuations 
generated after recombination which are superimposed and added to the 
primary fluctuations (other later imprints with different spectral signatures
are discussed below in the foregrounds section, Sect. 4).

After recombination, the photons and the matter decouple. However, the 
photons can interact with the gravitational potential wells or the ionised 
matter they encounter along their trajectories. If the universe is mostly 
neutral, then the photons propagate freely in the universe and the local, or 
integrated, variations of the gravitational potential along the line of sight 
produce secondary brightness anisotropies. If the universe is ionised, homo- 
geneously or inhomogeneously, the photons interact by Compton effect with 
the matter generating secondary anisotropies. With respect to CMB mea- 
surements, the secondary effects result in spurious temperature anisotropies. 
However, they represent very important tools for our understanding of the 
matter distribution at large scales and its ionisation state. 

The secondary effects can be classified in the two following major types, 
namely those induced by gravity or reionisation, with several origins possible 
for each type. 

3.2.1 Gravitational effects 

• The Rees-Sciama effect [133] 

This effect takes place when the gravitational potential well crossed
by the photons varies during their travel time. Indeed, if the
photons fall into a gravitational potential well different from the one
they climb out of, this induces a gravitational redshift (of the Sachs-Wolfe
type [139]). The variations in the potential wells are due essentially
to the non-linear evolution of the structures. In the case when the
potential well is deepened, the photons gain energy and are blue
shifted while going towards the centre of a structure, and they lose
even more energy than they gained while going out. This difference
results in a net red shift along that line of sight. There is thus an
additional fluctuation imprinted, with no specific spectral signature
to distinguish it from CMB primary anisotropies.

The Rees-Sciama effect generates temperature fluctuations, with amplitudes
of about a few $10^{-7}$ to $10^{-6}$; its amplitude is maximum for
scales between 10 and 40 arcmin (see Seljak [144]). At the degree
scale, the Rees-Sciama contribution is of the order of 0.01 to 0.1% 
of the primary CMB power. This effect is therefore not expected to 
affect significantly the CMB results. 

• Integrated Sachs-Wolfe effect 

Along their lines of sight, the photons undergo an effect similar to 
the Rees-Sciama one with all the gravitational potential wells they 
encounter. It takes into account the global time variations of the 
potentials, in particular during the linear phase (Hu and Sugiyama [89,
90]).

• Second order non-linear effects

These have been the subjects of many studies, like those of Martinez-Gonzales
et al. and Sanz et al. [116,140], which aim at evaluating
the second order perturbations due to gravitational potential wells.
They showed that the quasi-linear and non-linear evolution of the
perturbations generates, at low redshifts, secondary anisotropies with
a maximum amplitude of about $\delta T/T \sim 10^{-6}$ at the degree scale.

• Gravitational lensing effects

The integrated Sachs-Wolfe effect can be viewed as an impulse (positive
or negative) given to the photons when they cross the density
fluctuations along the line of sight. Gravitational lensing is responsible
for variations of the photon trajectories in the transverse direction
(Blanchard and Schneider 1987; Cayon et al. 1993; Seljak 1996 [20,
37,143]). In fact, gravitational lensing redistributes power in the temperature
anisotropies between the different angular scales. This effect
takes place at the degree scale and it becomes more significant on
smaller scales, see Figure 8. The gravitational lensing effect slightly blurs
the image of the last scattering surface and can thus erase the
fluctuations at very small scales.

Gravitational lenses are not static, and they are likely to have bulk mo- 
tions across the line of sight. In this case, photons crossing the leading 
edge of a lens will be redshifted because of the increasing depth of the 
potential well during their crossing time; while photons crossing the 
trailing edge of the same structure are blue shifted (Birkinshaw and 
Gull 1983; Birkinshaw 1989 [15,17]). This results in a characteristic 
spatial signature for the induced anisotropy: a hot-cold temperature 
spot mimicking butterfly wings. Generalised to a population of col- 
lapsed objects (small groups to rich clusters) moving across lines of 
sight, the contribution of the induced secondary anisotropies is sig- 
nificantly smaller than the primary anisotropies as shown in Figure 9 
from Aghanim et al. [7]. 

Topological defects like cosmic strings, although not the main source of
structures in the universe, are still predicted by most relevant particle
physics models. In fact many recent models predict the formation
of cosmic strings at the end of an inflationary period. It is thus an
important goal for the next generation CMB experiments to be able
to search for the signatures of these. Figure 7 shows the structure
of the CMB anisotropies generated by cosmic strings after the last
Fig. 7. A calculation by Bouchet, Bennett et al. (1998) of the fluctuations imprinted
by (local) cosmic strings after the last scattering surface. The scale of the
image is 7.2 degree and the $\Delta T/T$ fluctuations span from about $-50$ to $+50$.
The mass per unit length of the strings, $\mu$, is likely to be around $G\mu/c^2 \sim 10^{-6}$.
These fluctuations add to those already present at the last scattering surface.
Adapted from [25].

scattering surface^. These are secondary anisotropies also imprinted 
via a moving lens effect, like above. In fact, one may picture each 
infinitesimal element of a string as the source of a “butterfly” pattern. 

^Strings also generate fluctuations on the last scattering surface, but so many strings 
contribute to each scale that the result is rather Gaussian, which blurs the non-Gaussian 
signature of later imprints by moving strings.
Fig. 8. CMB anisotropy power spectrum $\ell(\ell+1)C_\ell$ versus $\ell$ with lensing (dashed
lines) and without lensing (solid lines). Upper curves are for adiabatic CDM
model with $h = 0.5$, $\Omega_{m0} = 0.4$ and $\Omega_{\Lambda 0} = 0.6$, lower curves are for adiabatic
CDM model with $h = 0.5$, $\Omega_{m0} = 1$ and $\Omega_{\Lambda 0} = 0$. Both models are normalized to
COBE. Lensing smoothes the sharp features in the power spectrum, but leaves
the overall shape unchanged. The two models show a typical range of the lensing
effect on CMB. Reprinted from [143].

Their superposition then leads to a step-like discontinuity along the 
string, whose magnitude is proportional to the local string velocity
transverse to the line of sight. 

Footprints for the strings are the non-Gaussian properties induced by
their step-like discontinuities, although they will be hard to detect. Indeed,
(larger) last scattering surface fluctuations will be superimposed
and the easiest signatures will be at very small angular scale (when 
the primary fluctuations die out), while still requiring the mapping of 
a substantial area. 

3.2.2 Effects of the reionisation 

The recombination of the universe at a redshift of about 1000 is a tran- 
sition phase between ionisation and neutrality. However, the absence of 
the characteristic Lyman-alpha absorptions for neutral hydrogen in the 
spectra of distant quasars (also called the Gunn-Peterson test, Gunn and 
Peterson 1965 [74]) indicates that the universe is totally ionised at a redshift 
Fig. 9. A comparison of primary and secondary power spectra for the standard
CDM model (solid line), open CDM model (dashed line) and lambda CDM model
(dotted line), after Aghanim et al. (1998). The top lines show the primary fluctuation
spectra, while the thick lines at mid-height of the plot correspond to the
Sunyaev-Zel’dovich kinetic effect (velocity parallel to the line of sight). The bottom
(thin) lines are for the moving lens effect (velocity perpendicular to the line of sight). Reprinted from [7].



of about 5 (Songaila et al. [148]). Reionisation must then have occurred after
recombination and been completed by z ~ 5. Reionisation again
couples matter and radiation. In this context, secondary
anisotropies are generated through two main effects:



i) inverse Compton interactions, or the so called Sunyaev-Zeldovich thermal
effect, which gives a specific spectral distortion (non $\delta T$) and as
such is discussed as a foreground in Section 4.2.2,

ii) Doppler effect of the photons when the scattering ionised gas has a
bulk motion with respect to the CMB, also called Sunyaev-Zeldovich
kinetic effect.

The major secondary effects arising from local or global
reionisation of the universe are the following:

• Ostriker-Vishniac effect

This effect (Ostriker and Vishniac 1986; Vishniac 1987 [119,164]) represents
the second order anisotropies generated from the correlations
of the density and velocity perturbations along the line of sight, when
the universe is totally ionised. This effect produces anisotropies at
the few arcminute scale whose amplitudes depend much on the ionisation
history of the universe (Dodelson and Jubas 1995; Hu and
White 1996 [48,91]). However, they remain smaller than the primary
anisotropies (see Fig. 10) for $\ell < 2000$.

• Sunyaev-Zeldovich kinetic effect

The inverse Compton scattering of the CMB photons on the free elec- 
trons of hot intra-cluster gas produces secondary anisotropies 
(Zeldovich and Sunyaev 1969; Sunyaev and Zeldovich 1972; 1980 [149, 
150, 171]). If the galaxy cluster moves, with respect to the CMB rest 
frame, the radial motion (along the line of sight) induces an additional 
anisotropy due to a first order Doppler effect. This anisotropy has no 
specific spectral signature to distinguish it from primary anisotropies. 
It is thus a source of confusion in CMB measurements. A population 
of galaxy clusters is expected to generate a distribution of temperature 
anisotropies whose power spectrum can be predicted [7, 10]. It shows 
that these anisotropies dominate over the primary at angular scales 
smaller than about 5 arcmin (see Fig. 9) with an expected turn over 
in the power spectrum at very small scales. It corresponds to the cut 
off in the mass distribution of galaxy clusters (Fig. 10, dashed line). 
A simulated map of this effect can be seen in Figure 19 and compared 
with that of the thermal SZ effect. 

The Sunyaev-Zeldovich kinetic effect can intervene in all types of col- 
lapsed objects containing ionised gas such as primordial galaxies host- 
ing super-massive black holes (Aghanim et al. [2]). In fact, recent stud- 
ies of galactic nuclei suggest that most galaxies are seeded by black 
holes which power the central nucleus. In this picture, the proto- 
galactic object is likely to have undergone a very active phase dur- 
ing which the surrounding medium was shocked, ionised and heated. 
The contribution of such a population of proto-galaxies to the an- 
gular power spectrum would constitute the major source of CMB 
anisotropies at the arc second and sub-arc-second scales (Fig. 10). 

• Inhomogeneous reionisation

Several ionisation processes have been proposed to achieve the total 
ionisation of the universe by z = 5. One of them is related to the 
Fig. 10. Power spectra of the different sources of temperature anisotropies taken
from the literature. The solid thin line represents the CMB primary anisotropy
spectrum computed using the CMBFAST code [145]. The thick solid line represents
the SZ kinetic effect of the proto-galaxies for $\Omega = 1$. The thick dashed
line is obtained for $\Omega = 0.3$ [2]. The triple-dotted-dashed line represents the
Vishniac-Ostriker effect contribution as computed by Hu and White [91] with a
total reionisation occurring at $z_i = 10$. The dashed and dotted-dashed lines represent
respectively the galaxy cluster contribution due to the kinetic SZ effect (as
estimated by Aghanim et al. [3]), for a cut-off mass of Mq, and the Rees- 
Sciama effect (taken from Seljak [144]), with cts = 1 and Qoh = 0.25. The dotted 
line represents the upper limit of the contribution due to the inhomogeneous reion- 
isation of the universe as computed by Aghanim et al. ]5] , for a quasar lifetime of 
10^ yrs. Reprinted from [2]. 



first emitting objects such as high redshift (5 < z < 10) quasars. The
reionisation affects the CMB. When they emit, the first quasars (or 
stars, or galaxies) ionise the surrounding neutral gas. The proper 
motion of these ionised bubbles generates, by Doppler effect, sec- 
ondary anisotropies. Again, there is no spectral signature to distin- 
guish them from the primary anisotropies. This effect was studied 
by several authors (Aghanim et al. 1996 and 1999; Gruzinov and Hu 
1998; Knox et al. 1998 [5,6,69,101]). For the set of parameters that 
maximises the effect, the contribution of the inhomogeneous reionisation
dominates, at all scales larger than the arc second, the secondary
effects already mentioned (Fig. 10).

Among all the sources of secondary anisotropies, the ones due to interac- 
tions with the gravitational potential wells (the gravitational effects) have 
very small amplitudes. Therefore, they are not expected to contribute
significantly to CMB measurements or to introduce spurious signals. By
contrast, the effects arising from interactions with ionised matter after
recombination, in collapsed structures or in ionised bubbles, induce temperature
anisotropies of larger amplitudes; they can significantly contribute to the sig- 
nal. Still, the primary CMB anisotropies dominate at all scales larger than 
the damping scale around 5 arcmin. At intermediate scales, several effects 
take place among which the inhomogeneous reionisation and the SZ effect 
are the largest effects. At very small scales, the temperature anisotropies 
could be totally dominated by the contribution of proto-galaxies seeded by 
black-holes. 

4 Astrophysical foregrounds 

4.1 Physics of galactic foregrounds 

The galactic emissions are associated with dust, free-free emission from 
ionised gas, and synchrotron emission from relativistic electrons. We are 
still far from a unified model of the galactic emissions. On the other
hand, the primary targets for CMB determinations are regions of weakest 
galactic emission, while stronger emitting regions are rather used for studies 
of the interstellar medium. In this chapter, we describe the physics of these 
emissions which is relevant for the high galactic latitude part of the sky for 
which the emission comes essentially from the diffuse interstellar medium. 

4.1.1 Dust emission 

This restriction to the best half of the sky results in considerable simplifica- 
tion for modelling the dust emission, since there is now converging evidence 
that the dust emission spectrum from high latitude regions with low HI 
column densities can be well approximated with a single dust temperature 
and emissivity, between 300 GHz and 2 THz. There is no evidence for a 
large amount of very cold dust in these regions. 

The spectrum of the dust emission at millimetre and sub- 
millimetre wavelengths has been measured by the Far-Infrared Absolute 
Spectrophotometer (FIRAS) aboard COBE with a 7 degree beam. In the 
galactic plane, dust along the line of sight is expected to spread over a rather 
wide range of temperatures just from the fact that the stellar radiation field 
varies widely from massive star forming regions to shielded regions in opaque
molecular clouds. A detailed analysis of the distribution of cold dust and 
its temperature has been carried out by Lagache et al. [106], who show that 
a component of cold dust around 12 K does exist in the interstellar medium 
but that the cold fraction decreases when going towards clouds with smaller 
column density. A consistent picture for the high galactic latitude regions 
thus emerges. 

Boulanger et al. [32] analysed the far-IR and sub-mm dust spectrum
by selecting the fraction of the FIRAS map, north of −30° declination,
where the HI emission is weaker than 250 K km s$^{-1}$ (36% of the sky). The
declination limit was set by the extent of the HI survey made with the
Dwingeloo telescope by Burton and Hartmann [36]. For optically thin
emission this threshold corresponds to $N(\mathrm{HI}) = 5 \times 10^{20}$ cm$^{-2}$. Below
this threshold, the correlation between the FIRAS and Dwingeloo values
is tight. For larger HI column densities, the slope and scatter increase,
which is probably related both to optical depth effects for the 21 cm line
and to the contribution of dust associated with molecular hydrogen (Savage
et al. 1997 [141]). Boulanger et al. found that the spectrum of the
HI-correlated dust emission seen by FIRAS is well fitted by one single dust
component with a temperature T = 17.5 K and an optical depth
$\tau/N_{\rm H} = 1.0 \times 10^{-25}\,(\lambda/250\,\mu\mathrm{m})^{-2}$ cm$^{2}$.
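
As a rough illustration of this one-component description, the sketch below converts a velocity-integrated 21 cm brightness into an optically thin HI column density and evaluates a single-temperature modified blackbody. The conversion factor 1.823 × 10^18 cm^-2 (K km s^-1)^-1 is the standard optically thin value. The emissivity law is an assumption of the sketch: a ν² law normalised to τ/N_H = 10^-25 cm² at 250 μm. All function names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZ = 1.380649e-23      # Boltzmann constant, J/K
C_LIGHT = 2.99792458e8      # speed of light, m/s


def hi_column_density(w_hi_k_kms):
    """Optically thin HI column density (cm^-2) from the 21 cm line integral (K km/s)."""
    return 1.823e18 * w_hi_k_kms


def planck_nu(nu_hz, temp_k):
    """Planck function B_nu in W m^-2 Hz^-1 sr^-1."""
    x = H_PLANCK * nu_hz / (K_BOLTZ * temp_k)
    return 2.0 * H_PLANCK * nu_hz**3 / C_LIGHT**2 / math.expm1(x)


def dust_optical_depth(wavelength_um, n_h_cm2, tau_per_nh_250=1.0e-25, beta=2.0):
    """Dust optical depth for an assumed power-law emissivity normalised at 250 um."""
    return n_h_cm2 * tau_per_nh_250 * (wavelength_um / 250.0) ** (-beta)


def dust_brightness_mjy_sr(wavelength_um, n_h_cm2, temp_k=17.5):
    """Optically thin dust brightness, tau * B_nu(T), in MJy/sr."""
    nu_hz = C_LIGHT / (wavelength_um * 1.0e-6)
    tau = dust_optical_depth(wavelength_um, n_h_cm2)
    return tau * planck_nu(nu_hz, temp_k) / 1.0e-20   # 1 MJy = 1e-20 W m^-2 Hz^-1


if __name__ == "__main__":
    n_h = hi_column_density(250.0)   # the 250 K km/s threshold quoted in the text
    print(f"N(HI) at threshold: {n_h:.2e} cm^-2")
    for lam in (100.0, 240.0, 850.0):
        print(f"{lam:6.0f} um: {dust_brightness_mjy_sr(lam, 1.0e20):.3f} MJy/sr per 1e20 H cm^-2")
```

With these assumptions the 250 K km s^-1 threshold maps onto a column density of roughly 5 × 10^20 cm^-2, in line with the value quoted above.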

The spectrum of the emission at high latitude does not fit the spectrum 
of the H I-correlated emission. Reach et al. [132] have shown that it can be 
fitted by a two-temperature model. Nevertheless, the residuals to the
one-temperature fit already allow one to put an upper limit on the sub-mm
emission from a very cold component almost one order of magnitude lower
than the value claimed by Reach et al. [132]. Assuming a temperature of 7 K
for this purported component one can set an upper limit on the optical depth
ratio $\tau(7\,\mathrm{K})/\tau(17.5\,\mathrm{K}) \sim 1$. Dwek et al. [53] have also derived the emission
spectrum for dust at Galactic latitudes larger than 40°, by using the spatial
correlation of the FIRAS data with the 100 μm all-sky map from the Diffuse
Infrared Background Experiment (DIRBE). They compared this spectrum 
with a dust model not including any cold component. The residuals of this 
comparison show a small excess which could correspond to emission from 
a very cold component at a level just below the upper limit set by the 
Boulanger et al. [32] analysis. Both studies thus raise the question of the 
nature of the very cold component measured by Reach et al. [132]. 

Puget et al. [128] found that the sub-mm residual to the one temper- 
ature fit of Boulanger et al. [32] is nearly isotropic over the sky. To be 
Galactic, the residual emission would have to originate from a halo large 
enough (> 50 kpc) not to contradict the observed lack of Galactic latitude 
and longitude dependence. Since such halos are not observed in external 
galaxies, Puget et al. [128] suggested that the excess is the Extragalactic 
Background Light (EBL, also called the Cosmic InfraRed Background -
CIRB) due to the integrated light of distant IR sources. Since then this 
cosmological interpretation has received additional support. Schlegel et al. 
and Hauser et al. [81, 142] also detected a residual somewhat higher than 
the one obtained by Guiderdoni et al., Lagache, Lagache et al., and Fixsen 
et al. [58,71,104,105]. The determination of the extragalactic component 
in Schlegel et al. [142] is not very accurate from their own estimate. 

As pointed out by Puget et al. [128] the main uncertainty in the first 
detection of the Cosmic Infrared Background was the difficulty of estimating
the emission of dust associated with ionised gas in the interstellar medium. 
The distribution of the ionised gas over the sky is partly correlated with 
the neutral component traced by the 21 cm line of HI. A most conservative 
hypothesis to establish the existence of the CIRB was made by Puget et al.,
by assuming that this correlated fraction is small and that dust in the 
ionised gas has the same emissivity per proton as the dust associated with 
the neutral component. Schlegel et al. and Hauser et al. [81, 142] have not 
been able to identify the emission of the dust in the ionised gas and give 
evaluations of the CIRB without trying to estimate and remove it. This 
leads to higher estimates, while Lagache et al. [105] have shown the existence
of a galactic component likely to be associated with the ionised gas. 

The first analysis of Puget et al. was also redone by Guiderdoni et al., 
Lagache and Lagache et al. [71, 104, 106] in a smaller fraction of the sky 
where the HI column density is very small ($N_{\rm HI} < 1 \times 10^{20}$ H cm$^{-2}$, mostly
around the Lockman hole). In these regions, the interstellar dust emission 
is essentially negligible for frequencies below 1.5 THz and the CIRB is actu- 
ally the dominant component. This leads to a much cleaner determination 
of the background spectrum which is slightly stronger than the original de- 
termination by Puget et al. As we shall see below in Section 4.2.1, the 
cosmological interpretation fits well with the results of recent IR searches 
for the objects that cause this background. 

More recently Lagache et al. [107], using the WHAM Hα survey, have
obtained a separation of the emission due to dust in the neutral and ionised
components of the interstellar medium. They find that the dust emission
spectrum and normalisation per proton in the gas are very similar in the two
components. This also allows the spectrum of the CIRB to be extended up to
a frequency of 3 THz.

In summary, there is now converging evidence that the dust emission 
spectrum from most of the high latitude regions with low interstellar gas 
column densities can be well approximated with a single dust temperature 
and emissivity. 

Of course, this only holds strictly for the emissions averaged over the 
rather large lobe of the FIRAS instrument, and for regions of low column 
density. In denser areas like the Taurus molecular complex, there is a dust 







Fig. 11. Map of the emission at 100 μm, with a logarithmic scale. It is a
reprocessed composite of the COBE/DIRBE and IRAS/ISSA maps, with the zodiacal
foreground and confirmed point sources removed. Processing details may be found
in Schlegel et al. (1998) from which this is reprinted [142].



component seen by IRAS at 100 μm and not at 60 μm. This cold IRAS
component is well correlated with dense molecular gas as traced by 
emission [1,106,113,138]. 

Schlegel et al. [142], using the colour of the emission measured in the
DIRBE maps, produced maps of dust column density which are shown in
Figure 11.

At the longest wavelengths observed with FIRAS a small excess remains
present even in HI clouds when the column density of hydrogen is large
enough ($N_{\rm HI} > 1 \times 10^{20}$ H cm$^{-2}$). This is likely to be associated with a component
of colder dust at 12 K similar to that found in molecular clouds and
concentrated in the denser filaments of the cirrus clouds. An example of such
a behaviour was demonstrated by Bernard et al. [13] for a dense filament in
a high latitude cirrus by observations with the balloon-borne SPM
experiment at the focal plane of the PRONAOS telescope. The observed filament
is rather thick but illustrates how similar but weaker filaments could
modify the basic single temperature model. The number density and
properties of such filaments in the thin cirrus remain an open question,
which is likely to be resolved in the near future by the new generation
of balloon experiments working at long wavelengths with better brightness
sensitivity (MAXIMA, BOOMERanG and ARCHEOPS).

Draine and Lazarian [52] recently proposed a new long wavelength com- 
ponent produced by electric dipole rotational emission from very small dust 
grains under normal interstellar conditions. Such an emission is very likely 
to be present as the observational evidence for very small grains out of equi-
librium with the interstellar radiation is now overwhelming (see for example 
Puget and Leger [131] and Boulanger [31]). Nevertheless the exact level of 
this emission is difficult to compute from first principles and it has to be 
measured. The emission is predicted to peak at a frequency of about 30 GHz 
and is not likely to account for the small excess at long wavelengths in the 
FIRAS data. 

In the following, we assume that the CMB data analysis will be per- 
formed only in the simpler but large connected regions of low H I column 
density and use the first order “one component” model. 

Concerning the scale dependence of the amplitude of the fluctuations,
Gautier et al. [63] found that the power spectrum of the 100 μm IRAS
data decreases approximately as the inverse third power of the spatial
frequency, $\ell$, down to the IRAS resolution of 4 arcmin. More recently,
Wright [170] analysed the DIRBE data by two methods, and concluded that
both give consistent results, with a high latitude dust emission spectrum,
$C(\ell)$, also $\propto \ell^{-3}$ in the range $2 < \ell < 300$, i.e. down to
$\theta \sim 60/\ell \sim 0.2$ deg (with $C(10)^{1/2} \sim 0.22$ MJy/sr at $|b| > 30°$).

Deep surveys with the ISOPHOT instrument on board the ISO satellite
have been taken at high galactic latitude by Herbstmeier et al., Kawara
et al., and Puget et al. [84,96,130]. These surveys, aimed at determining the
spatial structure of the far infrared background, found an emission with a
power spectrum $\propto \ell^{-3}$ in regions with rather strong cirrus emission. In
regions with very low cirrus, Lagache and Puget and Lagache et al. [107,108]
show that the emission observed with high signal to noise (the FIRBACK
survey) has a power spectrum that flattens at high spatial frequencies, where
it becomes dominated by the extragalactic fluctuations. The power
spectrum from the FIRBACK survey is shown in Figure 12; the similarity with
the predictions in Guiderdoni et al. [71] is rather striking.

Gautier et al. [63] also established long ago the normalisation of the
power spectrum of the interstellar infrared emission with the brightness:
$P(k) = P_0\,(k/k_0)^{-3}$ with $P_0 = 1.4 \times 10^{-12}\,B_0^3$ Jy$^2$/sr, with
$k_0 = 10^{-2}$ arcmin$^{-1}$ and $B_0$ the brightness at 100 μm in Jy/sr.
Nevertheless the normalisation established
from the IRAS data is quite uncertain at low brightness for thin cirrus.
The FIRBACK observations provide the best calibration of $P_0$ at 170 μm:

$$P_0 = 3.3 \times 10^{6}\,(B_{170})^{3}\ \mathrm{Jy^2/sr}, \qquad (4.1)$$

where $B_{170}$ is the brightness at 170 μm in MJy/sr. Using the spectrum of
thin cirrus clouds this corresponds to £0^^ = foK at 100 GHz. This 

normalisation factor is lower by a factor 1.6 than the one derived by Bouchet 
and Gispert [27] by another method using the IRAS data. The difference 
Fig. 12. (a) Power spectrum of the source-subtracted “marano-1” field (stars)
of the FIRBACK survey, once the instrumental noise power spectrum (dots) has 
been subtracted. The power spectrum from the cirrus is given by the dashed line, 
(b) Extragalactic source power spectrum in the “ELAIS-N2” field (the vertical 
line shows the angular resolution cutoff). Reprinted from [107]. 



is well within the uncertainties of such determinations. The calculations in
the following sections are carried out with the higher value.
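
The scaling of the cirrus power spectrum with brightness can be sketched as follows. This is only an illustration under assumed readings of the normalisations quoted above: a $k^{-3}$ spectrum pivoting at $k_0 = 10^{-2}$ arcmin$^{-1}$, a cubic dependence on brightness, with $P_0 = 1.4 \times 10^{-12} B_0^3$ Jy$^2$/sr at 100 μm ($B_0$ in Jy/sr) and $P_0 = 3.3 \times 10^{6} B_{170}^3$ Jy$^2$/sr at 170 μm ($B_{170}$ in MJy/sr). The function names are hypothetical.

```python
K0_ARCMIN_INV = 1.0e-2   # pivot spatial frequency, arcmin^-1 (assumed reading of the text)


def cirrus_power_100um(k_arcmin_inv, b100_jy_sr):
    """Cirrus power spectrum in Jy^2/sr at 100 um, with B0 given in Jy/sr."""
    p0 = 1.4e-12 * b100_jy_sr**3
    return p0 * (k_arcmin_inv / K0_ARCMIN_INV) ** -3.0


def cirrus_power_170um(k_arcmin_inv, b170_mjy_sr):
    """Cirrus power spectrum in Jy^2/sr at 170 um, with B170 given in MJy/sr."""
    p0 = 3.3e6 * b170_mjy_sr**3
    return p0 * (k_arcmin_inv / K0_ARCMIN_INV) ** -3.0


if __name__ == "__main__":
    # Illustrative brightness of 1 MJy/sr in both bands (not a value from the text).
    for k in (0.01, 0.05, 0.25):
        print(k, cirrus_power_100um(k, 1.0e6), cirrus_power_170um(k, 1.0))
```

Because of the cubic dependence on brightness, halving the cirrus brightness lowers the power by a factor of eight at every scale, which is why the cleanest regions of the sky matter so much.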



4.1.2 Free-free emission 

Observations of Hα emission at high galactic latitude as well as dispersion
measurements in the direction of pulsars indicate that low density
ionised gas (the Warm Ionised Medium, hereafter the WIM) accounts for
about 25% of the gas in the Solar Neighbourhood [137]. The column
density from the mid-plane is estimated to be in the range 0.8 to
$1.4 \times 10^{20}$ cm$^{-2}$. Until recently, little was known about the spatial distribution
of this gas but numerous Hα observing programs are currently in progress^.
From these Hα observations one can directly estimate the free-free emission
from the WIM (since both scale with the emission measure $\propto \int n_e^2\,\mathrm{d}l$, see
Valls-Gabaud [161] for a careful discussion).



^An important project is the northern sky survey started by Reynolds (WHAM
Survey). This survey consists of Hα spectra obtained with a Fabry-Perot through a 1°
aperture. The spectra will cover a radial velocity interval of 200 km s$^{-1}$ centred near the
local standard of rest with a spectral resolution of 12 km s$^{-1}$ and a sensitivity of 1 cm$^{-6}$
pc (5σ). Several other groups are conducting Hα observations with wide field cameras
(10°) equipped with a CCD and a filter (Gaustad et al. [62]). These surveys, with an
angular resolution of a few arc minutes, should be quite complementary to the WHAM
data and should allow about 90% of the sky to be covered.
Kogut et al. [102] found a correlation at high latitudes between the
DMR emission (after subtraction of the CMB dipole and quadrupole) and
the DIRBE 240 μm map. The observed change of slope of the correlation
coefficients at 90 GHz is just what is expected from the contribution of
the free-free emission as predicted by Bennett et al. [12]. The spectrum
of this emission may be described by $I_\nu \propto \nu^{-0.16}$ (the index is the best
fit value of Kogut et al. [103]). More recently Oliveira et al. [43]
cross-correlated the Saskatoon data [156] with the DIRBE data and also found
a correlated component, with a normalisation in agreement with that of 
Kogut et al. [102]. This result thus indicates that the correlation found by 
Kogut et al. [102] persists at smaller angular scales. 

Lagache et al. [107] studied the correlation of the DIRBE and FIRAS 
data with part of the WHAM survey. They find that the emission nor- 
malised per proton is very similar to the dust emission associated with H I 
both in spectrum and in intensity. 

Veeraraghavan and Davis [162] used Hα maps of the North Celestial
Pole (hereafter NCP) to determine more directly the spatial distribution
of free-free emission on sub-degree scales. Their best fit estimate is
$C_\ell^{\rm ff} = 1.3^{+1.4}_{-0.7}\,\ell^{-2.27 \pm 0.07}\ \mu\mathrm{K}^2$ at 53 GHz, if they assume a gas temperature
$\sim 10^4$ K. While this spectrum is significantly flatter than the dust
spectrum, the normalisation is also considerably lower than that deduced
from Kogut et al. [102]. Indeed, their predicted power at $\ell = 300$ is then a
factor of 60 below that of the COBE extrapolation if one assumes an $\ell^{-3}$
spectrum for the free-free emission. Additionally Leitch et al. [115] found a
strong correlation between their observations at 14.5 and 32 GHz towards
the NCP and IRAS 100 μm emission in the same field. However, starting
from the corresponding Hα map, they discovered that this correlated
emission was much too strong to be accounted for by the free-free emission
of $\sim 10^4$ K gas. These and other results suggest that only part of the
microwave emission which is correlated with the dust emission traced by
DIRBE at 240 μm may be attributed to free-free emission as traced by Hα.
The missing part could be attributed to free-free emission uncorrelated with
the dominating component at 240 μm.

As mentioned in the previous section, Draine and Lazarian [52] recently 
proposed a different interpretation for the observed correlation as the elec- 
tric dipole rotational emission from very small dust grains. Given the prob- 
able discrepancy noted above between the level of the correlated component 
detected by Kogut et al. [102] and the lower level of the free-free emission 
traced by Hα, it appears quite reasonable that a substantial fraction of this
correlated component indeed comes from spinning dust grains. The spec- 
tral difference between free-free and spinning dust grains shows up mostly 
at frequencies ≲ 30 GHz, i.e. mostly outside of the range probed by MAP
and Planck. However the spatial properties may be different. Indeed the 
spinning dust grains are probably very small grains emitting out of thermal
equilibrium in the 15 to 60 μm range. This range might then be the right
one to be used as a spatial template for this low frequency component. The
free-free emission might decorrelate from the dust emission at very small
scales (since HII regions might be mostly distributed in “skins” surrounding
dense HI clouds as proposed by McKee and Ostriker [117], a model
supported by comparisons of IRAS and Hα maps).

In the following we shall assume for simplicity that in the best half of the 
sky (therefore avoiding dense H I clouds) the correlated microwave emission 
with $I_\nu \propto \nu^{-0.16}$ in the MAP and Planck range is well traced at all

relevant scales by the dust emission which dominates at high frequency and 
is well traced by HI. Still there may well be an additional free-free emission 
which is not well traced by the H I-correlated dust emission. Indeed the 
correlated emission detected by Kogut et al. [102] might be only part of the 
total signal with such a spectral signature that DMR detected. While all 
of this signal may be accounted for by the correlated component, the error 
bars are large enough that about half of the total signal might come from 
an uncorrelated component^. This is what we shall assume below. 
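
The arithmetic behind this assumption can be checked with a short sketch. It assumes that the correlated and uncorrelated components are statistically independent, so that their rms amplitudes add in quadrature; the amplitudes are the 53 GHz, 10° values given in the accompanying footnote, and the helper name is illustrative.

```python
import math


def quadrature_sum(*components_uk):
    """Total rms of independent components (same units as the inputs)."""
    return math.sqrt(sum(c * c for c in components_uk))


if __name__ == "__main__":
    dt_cor = 5.2          # correlated free-free-like signal, microkelvin (assumed value)
    dt_uncor = dt_cor     # uncorrelated part taken equal to the correlated one
    dt_ff = quadrature_sum(dt_cor, dt_uncor)
    print(f"total free-free-like rms: {dt_ff:.2f} uK")
    # The measured total is 5.2 +/- 4.2 uK, so anything below 9.4 uK is within 1 sigma.
    print("within allowed range:", dt_ff <= 5.2 + 4.2)
```

Equal correlated and uncorrelated amplitudes correspond to half of the total variance coming from the uncorrelated component.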

4.1.3 Synchrotron emission 

Away from the Galactic plane region, synchrotron emission is the dominant 
signal at frequencies below ~ 5 GHz, and it has become standard prac- 
tice to use the low frequency surveys of Haslam [80] at 408 MHz shown in 
Figure 13 and Reich [134] at 1420 MHz to estimate by extrapolation the level 
of synchrotron emission at the higher CMB frequencies. This technique is
complicated by a number of factors. The synchrotron spectral index varies 
spatially due to the varying magnetic field strength in the Galaxy (Lawson 
et al. [114]). It also steepens with frequency due to the increasing energy 
losses of the electrons. Although the former can be accounted for by de- 
ducing a spatially variable index from a comparison of the temperature at 
each point in the two low frequency surveys, there is no satisfactory infor- 
mation on the steepening of the spectrum at higher frequencies. As detailed 
by Davies et al. [40], techniques that involve using the 408 and 1420 MHz 
maps are subject to many uncertainties, including errors in the zero levels 
of the surveys, scanning errors in the maps, residual point sources and the 
basic difficulty of requiring a large spectral extrapolation (over a decade in 
frequency) to characterise useful CMB observing frequencies. Moreover, the
spatial information is limited by the finite resolution of the surveys: 0.85° 
FWHM in the case of the 408 MHz map and 0.6° FWHM in the case of the 



^At 53 GHz, on the 10° scale, the total free-free like signal is $\Delta T_{\rm ff} = 5.2 \pm 4.2\ \mu$K,
while the correlated signal is $\Delta T_{\rm cor} = 6.8 \pm 1.6\ \mu$K. If we assume $\Delta T_{\rm cor} = 5.2\ \mu$K and
$\Delta T_{\rm uncor} = \Delta T_{\rm cor}$, we have $\Delta T_{\rm ff} = 7.3\ \mu$K which is well within the allowed range.
Fig. 13. Emission of the sky at 408 MHz in Aitoff projection, as surveyed by
Haslam (e.g. [80]). The dominant contribution is the synchrotron emission of the
Galaxy.



1420 MHz one. But additional information is available from existing CMB 
observing programs. 

For instance the Tenerife CMB project (Hancock et al. (1994), Davies
et al. (1995) [39,78]) has observed a large fraction of the northern sky with
good sensitivity at 10.4 and 14.9 GHz (and also at 33 GHz in a smaller
region). A joint likelihood analysis of all of the sky area implies a residual
rms signal of 24 μK at 10 GHz and 20 μK at 15 GHz. Assuming that
these best fit values are correct, one can derive spectral indices of
$\alpha = -1.4$ between 1.4 and 10.4 GHz and $\alpha = -1.0$ between 1.4 and 14.9 GHz.
These values, which apply on scales of order 5°, are in agreement with those
obtained from other observations at 5 GHz on ~2° scales (Jones et al. [94]).
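
A two-point spectral index of this kind follows from a simple power-law assumption, as in the sketch below. The convention (whether the index applies to intensity or to brightness temperature) is not fixed by the code, and the reference amplitude used in the example is a made-up number for illustration only.

```python
import math


def spectral_index(nu1, x1, nu2, x2):
    """Power-law index alpha such that x2 / x1 = (nu2 / nu1) ** alpha."""
    return math.log(x2 / x1) / math.log(nu2 / nu1)


def extrapolate(nu_ref, x_ref, nu, alpha):
    """Scale an amplitude from nu_ref to nu with a fixed power-law index."""
    return x_ref * (nu / nu_ref) ** alpha


if __name__ == "__main__":
    x_low = 100.0                                    # hypothetical amplitude at 1.42 GHz
    x_high = extrapolate(1.42, x_low, 10.4, -1.4)    # amplitude implied at 10.4 GHz
    print(f"amplitude at 10.4 GHz: {x_high:.3f}")
    print(f"recovered index: {spectral_index(1.42, x_low, 10.4, x_high):.2f}")
```

Since the true spectrum steepens with frequency, an index measured between two low frequencies gives only an upper bound on the amplitude when extrapolated to the CMB bands.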

At higher frequency, the lack of detectable cross-correlation between the 
Haslam data and the DMR data leads Kogut et al. [103] to impose an 
upper limit of $\alpha = -0.9$ for any extrapolation of the Haslam data in the
millimetre wavelength range at scales larger than ~7°. In view of the other
constraints at lower frequencies, it seems reasonable to assume that this 
spectral behaviour also holds at smaller scales. 

The spatial power spectrum of the synchrotron emission is not well 
known and despite the problems associated with the 408 and 1420 MHz 
maps it is best estimated from these. Bouchet and Gispert have computed 
the power spectrum of the 1420 MHz map for the sky region discussed above. 
The results show that at $\ell \gtrsim 100$ the power spectrum falls off approximately
as $\ell^{-3}$ (i.e. with the same behaviour as the dust emission).
Fig. 14. Effect of redshifting a template starburst galaxy spectrum (left). Longward
of ~300 μm, the fluxes of such galaxies are similar at all $z \gtrsim 0.5$.



4.2 Physics of the extragalactic sources foregrounds
4.2.1 Infrared galaxies and radio sources 



Here, we evaluate the contribution from the far infrared background pro- 
duced by the line of sight accumulation of extragalactic sources, when seen 
at the much higher resolution foreseen for future CMB experiments than 
the resolution of the FIRAS instrument which was used to detect the back- 
ground in the first place. 

Given the steepness of rest-frame galaxy spectra long-ward of ~100 μm,
predictions in this wavelength range are rather sensitive to the assumed
high-z history. Indeed, as can be seen from Figure 14, at a wavelength
larger than 800 μm a starburst galaxy at z = 5 might be more luminous
than its z = 0.5 counterpart, because the redshifting of the spectrum can
bring more power at a given observing frequency than the cosmological
dimming removes (e.g. see Blain and Longair in [19]), precisely in the range explored by
bolometric detectors like those of BOOMERanG, MAXIMA, ARCHEOPS,
and the High Frequency Instrument (HFI) aboard Planck.
This “negative K-correction” means that:

i) predictions might be relatively sensitive to the redshift history of 
galaxy formation; 

ii) variations of the observing frequency should imply a partial decorrela- 
tion of the contribution from distant galaxies, i.e. one cannot stricto 
sensu describe this contribution by spatial properties (i.e. a spatial 
template) and a single frequency spectrum (but see below that this 
might be a reasonable first approximation).
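
The negative K-correction can be illustrated with a toy calculation. The sketch below is not the template of Figure 14: it assumes a single-temperature greybody (T = 40 K, emissivity index 1.5, both hypothetical) and an Einstein-de Sitter universe, and compares the flux of the same galaxy placed at z = 5 and at z = 0.5 for a given observing wavelength.

```python
import math

H_OVER_K = 4.799243e-11    # h / k_B in K per Hz
C_UM_PER_S = 2.99792458e14  # speed of light in micrometres per second


def greybody_luminosity(nu_rest_hz, temp_k=40.0, beta=1.5):
    """Rest-frame L_nu of a greybody, in arbitrary units (assumed template)."""
    x = H_OVER_K * nu_rest_hz / temp_k
    return nu_rest_hz ** (3.0 + beta) / math.expm1(x)


def lum_distance_eds(z):
    """Luminosity distance in units of c/H0 for an Einstein-de Sitter universe."""
    return 2.0 * (1.0 + z) * (1.0 - 1.0 / math.sqrt(1.0 + z))


def observed_flux(wavelength_obs_um, z):
    """Observed flux density (arbitrary units): (1+z) L_nu(nu_obs (1+z)) / d_L^2."""
    nu_obs = C_UM_PER_S / wavelength_obs_um
    return (1.0 + z) * greybody_luminosity(nu_obs * (1.0 + z)) / lum_distance_eds(z) ** 2


def flux_ratio(wavelength_obs_um, z_far=5.0, z_near=0.5):
    """Flux of the distant galaxy relative to its nearby twin."""
    return observed_flux(wavelength_obs_um, z_far) / observed_flux(wavelength_obs_um, z_near)


if __name__ == "__main__":
    for lam in (100.0, 350.0, 850.0, 1400.0):
        print(f"{lam:6.0f} um: S(z=5)/S(z=0.5) = {flux_ratio(lam):.3g}")
```

With these assumed parameters the two fluxes are comparable at 850 μm, whereas at 100 μm the distant galaxy is fainter by several orders of magnitude; the exact ratio depends on the adopted temperature, emissivity index and cosmology.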

Estimates of the contribution of radio-sources and infrared galaxies to the 
anisotropies of the microwave sky have to rely on two types of measurements: 

i) the spectrum of the integrated power from all galaxies in the past (the 
extragalactic background light); 

ii) deep surveys of faint extragalactic sources in the whole wavelength 
range going from the thermal infrared to the radio. These may give 
source counts and the spatial power spectrum of the unresolved back- 
ground. 

A large set of relevant observations has become recently available. 

• The cosmic background is known with an accuracy better than 50% 
from the far ultraviolet to the microwave range. This is displayed 
in Figure 2. The most relevant part of this diagram for CMB ob- 
servations is the part which covers frequencies smaller than 1 THz. 
Nevertheless, the shorter wavelengths also provide strong constraints
when building a model for the evolution of infrared galaxies; 

• Deep source counts have been obtained in the far infrared using the 
ISOPHOT instrument by Kawara et al., Puget et al., and Dole et al. 
[51,95,129]. They find steeply rising number counts between 1 Jy 
and 100 mJy, indicative of a very strong evolution for the popula- 
tion of infrared bright objects detected. This population is probably 
closely related to that detected in the mid infrared at 15 μm with the
ISOCAM instrument in the very deep surveys conducted in the Hubble
deep field and other fields [43,55]. At 850 μm deep surveys conducted
with SCUBA also show very steep number counts (Blain et al. 1998;
Smail et al. 1997; Barger et al. 1999; Eales et al. 1999) [11,18,54,146];

• The power spectrum of brightness fluctuations at 170 μm has been
obtained by Lagache et al. [107,108] in very low cirrus regions. These 
fluctuations are likely due to fluctuations of the extragalactic back- 
ground (see Fig. 12). The extragalactic background power spectrum 
Fig. 15. Deep survey at 170 μm obtained with the ISOPHOT instrument aboard
the ISO satellite. Individual galaxies can be seen on the more extended
structures of the thin galactic cirrus clouds. Further details may be found in Dole
et al. (2000). Reprinted from [50].



in the range $0.1 < k < 0.5$ arcmin$^{-1}$ is compatible with white noise
and has a value $P_0 = 7000 \pm 3000$ Jy$^2$ sr$^{-1}$.
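
To see where a flat extragalactic spectrum of this level overtakes a steep cirrus spectrum, one can solve for the crossover spatial frequency. The sketch below assumes a cirrus spectrum $P_0 (k/k_0)^{-3}$ with $k_0 = 10^{-2}$ arcmin$^{-1}$ and the 170 μm normalisation $P_0 = 3.3 \times 10^{6} B_{170}^3$ Jy$^2$/sr; the cirrus brightness of 1 MJy/sr is an arbitrary illustrative value, and the function name is hypothetical.

```python
def crossover_wavenumber(p0_cirrus, p_white, k0=1.0e-2, slope=3.0):
    """Spatial frequency (arcmin^-1) where p0_cirrus * (k/k0)**-slope equals p_white."""
    return k0 * (p0_cirrus / p_white) ** (1.0 / slope)


if __name__ == "__main__":
    b170 = 1.0                      # cirrus brightness in MJy/sr (illustrative only)
    p0 = 3.3e6 * b170**3            # cirrus power at k0, Jy^2/sr (assumed normalisation)
    p_extragalactic = 7000.0        # white-noise level quoted in the text, Jy^2/sr
    k_cross = crossover_wavenumber(p0, p_extragalactic)
    print(f"cirrus and background are equal at k = {k_cross:.3f} arcmin^-1")
```

Fainter cirrus moves the crossover to lower spatial frequencies in proportion to the brightness, so in very clean fields the extragalactic fluctuations dominate over a wider range of scales.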

These observations can be used through phenomenological models describ- 
ing the luminosity function as a function of emitted frequency and redshift 
which are constrained by these data, as in Toffolatti et al. [160], or to 
explore the range of redshift distributions which are compatible with the 
CIRB spectrum, as in Gispert et al. [65]. Alternatively, one may devise
ab initio models that include all known physical constraints within a given
cosmological paradigm (i.e. the hierarchical growth of structures)^.



^These models attempt to include all accrued knowledge from studies of large scale 
structure formation - governed by the gravitational dynamics of dark matter - and 
implement basic laws for describing the dissipative baryonic physics. In short, one starts 
from a matter power spectrum (typically a standard, COBE- normalised CDM), and 
estimates the number of dark matter halos as a function of their mass at any redshift 
(e.g. using the Press-Schechter approach). Standard cooling rates are used to estimate
the amount of baryonic material that forms stars, which are then laid down according 
to an assumed Initial Mass Function, on a timescale related to the local dynamical time. 
Following the work of White and Frenk [165], this has become standard
practice for predicting properties in the UV and optical bands; despite some
difficulties, these models have been rather successful, since many new
observations fit naturally in this framework. This type of physical modelling
was extended to long wavelengths by Guiderdoni et al. [72] and it also met
some success^. In particular, their model “E” predicted rather well the
FIRBACK results concerning the 170 μm number counts and the spatial power
spectrum of the background. It is now known though that the redshift
distribution of that “E” model has too long a high-z tail and a new
generation of such models is being developed by several groups to include the
new constraints. In the meantime, we shall use that E model for numerical
estimates and comparisons with other sources of microwave fluctuations.

In what follows, we shall assume that resolved sources have been re- 
moved, and we focus on the spatial and spectral properties of the remaining 
unresolved background. That can be done only by assuming a specific in- 
strumental configuration giving the level of detector noise and a geometrical 
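beam for each channel. Indeed the variance of the shot noise contributed by
$N$ sources of flux $S$ randomly distributed on the sky is $\sigma^2 = N S^2$, and the
corresponding power spectrum is flat (white noise). For a population of
sources with flux distribution per solid angle $\mathrm{d}N/\mathrm{d}S/\mathrm{d}\Omega$, the power spectrum
is, after removing all sources brighter than $S_c$,

$$C(\ell) = \frac{\sigma_{\rm conf}^2}{\Omega_{\rm beam}^2} = \frac{1}{\Omega_{\rm beam}} \int_0^{S_c} \frac{\mathrm{d}N}{\mathrm{d}\Omega\,\mathrm{d}S}\, S^2\, \mathrm{d}S. \qquad (4.2)$$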



If the cut $S_c$ is defined as $S_c = q\,(\sigma_{\rm conf}^2 + \sigma_{\rm dust}^2 + \sigma_{\rm instrum}^2 + \sigma_{\rm CMB}^2 + \cdots)^{1/2}$
with $q$ fixed to 5, this formula allows the confusion limit $\sigma_{\rm conf}$ to be
derived iteratively. This is the standard deviation of the fluctuations of the
unresolved galaxy background, once all sources with $S > 5\sigma_{\rm tot}$ (i.e. including
the other sources of fluctuations: cirrus, $\sigma_{\rm dust}$; detector noise, $\sigma_{\rm instrum}$;
and the CMB, $\sigma_{\rm CMB}$) have been removed.
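
The iteration can be sketched as follows. This is only a minimal illustration, not the procedure used for Table 2: it assumes hypothetical power-law counts $dN/dS/d\Omega = K S^{-\gamma}$ with $1 < \gamma < 3$, for which $\sigma_{\rm conf}^2 = \Omega_{\rm beam} \int_0^{S_c} S^2\,(dN/dS/d\Omega)\,dS$ is analytic, and it lumps all the other fluctuations (dust, detector noise, CMB) into a single `sigma_other`. The function name and its parameters are illustrative.

```python
import math

def confusion_limit(k, gamma, omega_beam, sigma_other, q=5.0, tol=1e-12, max_iter=1000):
    """Iterate sigma_conf and the cut S_c = q * sigma_tot to a fixed point.

    Assumes hypothetical counts dN/dS/dOmega = k * S**(-gamma), 1 < gamma < 3, so
    sigma_conf**2 = omega_beam * k * S_c**(3 - gamma) / (3 - gamma).
    """
    if not 1.0 < gamma < 3.0:
        raise ValueError("this sketch needs 1 < gamma < 3")
    sigma_conf = max(sigma_other, 1.0)  # arbitrary positive starting guess
    for _ in range(max_iter):
        # Cut at q times the total rms (sources plus everything else).
        s_cut = q * math.hypot(sigma_conf, sigma_other)
        # Variance left by the sources fainter than the cut.
        new = math.sqrt(omega_beam * k * s_cut ** (3.0 - gamma) / (3.0 - gamma))
        converged = abs(new - sigma_conf) <= tol * max(new, 1.0)
        sigma_conf = new
        if converged:
            break
    return sigma_conf, q * math.hypot(sigma_conf, sigma_other)
```

For Euclidean counts ($\gamma = 2.5$) and no other noise, the fixed point can be checked against the closed form $\sigma_{\rm conf} = (2\,\Omega_{\rm beam} K \sqrt{q})^{2/3}$; adding other noise terms raises the cut and therefore the confusion level.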

Table 2 summarises sensitivity limits and expected number counts for 
the E-model results for the HFI instrument [70, 124] using the evaluation 



The stellar energy released is obtained from a library of stellar evolutionary tracks and 
spectra; computations may be then performed to compare in detail with observations. 

^In these early models, no attempt was made to compute the fraction of obscured
starbursts versus redshift, although this study showed that these dust-enshrouded
starbursts are a necessary ingredient to account for the CIRB spectrum. Instead,
several scaling laws were assumed and the corresponding observables predicted,
each assumption corresponding to a specific model. Another limitation of these
early models was the assumed one-to-one correspondence between a given infrared
bolometric luminosity (fully computed in the model) and a particular infrared
Spectral Energy Distribution, although this correspondence was made to reproduce
the observed correlations found in earlier studies.
Table 2. Theoretical estimates from model E of [71] and from simulations.
(1) HFI wavebands central frequency in GHz. (2) Beam full width half maximum
in arcmin. (3) 1σ instrumental noise for 14 month nominal mission.
(4) 1σ fluctuations due to cirrus at $N_{\rm HI} = 1.3 \times 10^{20}$ cm$^{-2}$ (level of the
cleanest 10% of the sky). The fluctuations have been estimated following Gautier
et al. (1992) with a power-law spectrum $P(k)$ and $P_{0,100\,\mu{\rm m}} = 1.4 \times 10^{-12}\, B^3_{0,100\,\mu{\rm m}}$.
(5) 1σ CMB fluctuations for $\Delta T/T = 10^{-5}$. (6) 1σ confusion limit due to FIR
sources in beam $\Omega = \theta^2_{\rm FWHM}$, defined by $\sigma_{\rm conf} = (\int_0^{S_{\rm lim}} S^2 (dN/dS)\, dS)^{1/2}$.
The values $\sigma_{\rm conf}$ and $S_{\rm lim} = q\sigma_{\rm tot}$ have been estimated iteratively with $q = 5$.
(7) $\sigma_{\rm tot} = (\sigma^2_{\rm ins} + \sigma^2_{\rm conf} + \sigma^2_{\rm cir} + \sigma^2_{\rm CMB})^{1/2}$. Here $\sigma_{\rm cir}$ is for $N_{\rm HI} = 1.3 \times 10^{20}$ cm$^{-2}$.
(8) Surface density of FIR sources for $S_{\rm lim} = 5\sigma_{\rm tot}$. (9) As seen in Figure 16,
simulations detailed in Section 7.3 suggest a detection threshold of ~ 50 mJy
at 353 GHz, yielding a source density close to the confusion for 5 arcmin beam
defined as 1 source every 50 beams.
of confusion described above. These numbers are compared with numerical
simulations of the source extraction process using the Toffolatti et al. model [160].
A striking feature is the very high level of the confusion limit computed
analytically with respect to the simulated one. It translates into rather
different numbers of detected sources.

One should realize though that the analytical estimate is very pessimistic
since it assumes a very naive source removal scheme by a simple thresholding.
In practice one would at least use a compensated filter (e.g. by doing
aperture photometry) which removes the contributions to the variance
at low spatial frequencies®. A much larger number of sources would then
be removed since the counts are very steep. One should thus regard the



®A compensated filter (whose integral is naught) essentially nulls the contribution of 
fast decreasing power spectra contributors, like dust, whose contribution to the variance 
is concentrated at large scales. And it also decreases the amount of white noise from 
the detector and the unresolved sources themselves, since then sources have to stand out 
above a reduced threshold. 
Fig. 16. Source detection in the 44 GHz and 353 GHz channels of Planck after
Hobson et al. (1999). Top: input source maps. Middle: MEM recovered source
map at the same frequency. Gray scales denote the flux density in Jy. Bottom:
comparison of input and output source counts obtained using SExtractor. The
extraction is essentially complete down to 70 mJy. Reprinted from [87].



analytical confusion level derived above as an upper limit. The raw numerical
results of Hobson et al. [87] (see Fig. 16) suggest a detection threshold
around 70 mJy, when the Toffolatti et al. model [160] is used. This source
model might underestimate the confusion limit, but there are still more
than 120 000 sources at 350 μm (860 GHz) at flux > 100 mJy.

The spectral dependence of the confusion limit can be approximately
modelled as that of a modified black body with an emissivity $\propto \nu^{0.70}$ and
a temperature of 13.8 K. A more precise fit (at the ~ 20% level) at $\nu >$
100 GHz for the equivalent temperature fluctuation power spectrum (using
relative bandwidths of 0.25) is given by



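$$\ell\,C(\ell)^{1/2} \simeq \frac{7.1 \times 10^{-9}}{e^{x/2.53} - 1} \left(1 - \frac{0.16}{x^4}\right) \frac{\sinh^2 x}{x^{0.3}}\ \ell\ [{\rm K}], \tag{4.3}$$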



with $x = h\nu/2kT_0 = \nu/(113.6\ {\rm GHz})$. This gives a level of
0.005 μK at 100 GHz. At lower frequencies ($\nu <$ 100 GHz), this model
yields instead

$$\ell\,C(\ell)^{1/2} \simeq 6.3 \times 10^{-9}\,(0.8 - 2.5x + 3.38x^2)\,\frac{\sinh^2 x}{x^4}\ \ell\ [{\rm K}]. \tag{4.4}$$

One should remember though (see Fig. 14) that the relative fluxes in dif- 
ferent observing bands can vary depending on the redshift of the object. 
Conversely, different bands do not weight different redshift intervals
equally. The previous analysis thus does not tell us whether the fluctuation
pattern at a given frequency is well correlated with the pattern at another 
frequency. Of course, the answer will depend on the resolution of the maps 
and the noise level. 

To help answer this question for the Planck-HFI instrument [124],
maps of the background contributed by unresolved galaxies were generated
at 30 different frequencies^. As shown in Figure 17b, the cross-correlation
coefficients of the maps (once the $5\sigma_{\rm tot}$ sources have been removed)
are better than 0.95 in the 100-350 GHz range, better than 0.75
in the 350-850 GHz range, and still ≳ 0.60 in the full 100-1000 GHz
range. This suggests that one can treat the IR background from unresolved
sources reasonably well as just another template to be extracted from the
data, with a fairly well defined spectral behaviour, at least in the range
probed by the HFI. Note though that the model may not do full justice to
the diversity of spectral shapes of the contributing IR galaxies. The main
uncertainty in the predictions of the degree of cross-correlation between
different frequencies is nevertheless the uncertainty in the redshift
distributions of the infrared galaxies that account for the present data. More
secure predictions will have to await the completion of redshift follow-up of
the new far-infrared/submillimetre catalogs . . .
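
To make the quantity being quoted concrete, here is a minimal sketch of a cross-correlation coefficient between two maps. The text does not specify the estimator used for Figure 17b; the Pearson coefficient over pixels is assumed here, and the maps are taken as plain lists of pixel values on a common pixelisation. The function name is illustrative.

```python
import math

def cross_correlation(map_a, map_b):
    """Pearson cross-correlation coefficient of two maps given as equal-length pixel lists."""
    if len(map_a) != len(map_b) or len(map_a) < 2:
        raise ValueError("maps must share the same pixelisation and have at least 2 pixels")
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    # Covariance and variances about the map means (common 1/n factors cancel).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / math.sqrt(var_a * var_b)
```

A coefficient close to 1 between two frequencies means that one map is nearly a rescaled copy of the other, which is what allows the unresolved background to be handled as a single template with a well-defined spectrum.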



^Details of the procedure are described in Guiderdoni et al. and Hivon et al. [73,86], 
but the results quoted in the text correspond to the newer E model. 
Fig. 17. (a) Absolute flux in each pixel as a function of frequency; the maps have
been convolved with a Gaussian beam of 5' FWHM. The solid black line shows
the mean spectrum, the median spectrum is denoted by dashes, while the various
contours encircle 10, 30, 70 and 90% of the pixel values. (b) Cross-correlation
coefficients between the full resolution maps. Adapted from [86].



While the model “E” above might well be the best available for bounding
the properties of infrared sources, it does not take into account the
contribution from low-frequency point sources like blazars, radio sources, etc.,
which are important at the lower frequencies probed by MAP and the Low
Frequency Instrument on Planck (LFI). Using only that model would
underestimate the source contribution at low frequencies. Fortunately,
Toffolatti et al. [160] made detailed predictions in the LFI range which
give






$$\ell\,C(\ell)^{1/2} \simeq \frac{5.7\,\sinh^2(\nu/113.6)}{(\nu/1.5)^{4.75 - 0.185\log(\nu/1.5)}}\ \ell\ [{\rm K}]. \tag{4.5}$$



This yields a level of 0.02 μK at 100 GHz®. Here again, we assume that
this unresolved background has well-defined spectral properties.

In order to obtain a pessimistic estimate of the total contribution from 
sources, we shall consider in the following two uncorrelated backgrounds 
from sources, the one contributed by IR sources, as described by the 



®In the following, we shall use that prediction when analysing the expected perfor- 
mances of MAP, although in that case the remaining unresolved background should be 
somewhat higher due to the lower sensitivity and angular resolution of MAP. 
equations (4.3) and (4.4), and the one contributed by radio-sources, as described
by equation (4.5). The corresponding levels are compared with the
other sources of fluctuations in Figure 21 and they are quite low (at least at
$\ell \lesssim 1000$) as compared to the expected fluctuations from the CMB or from
the Galaxy. The fluctuation level shown in Figure 21 is obtained with a simple
Poisson distribution of the galaxies. The expected correlation in their
distribution should increase somewhat the low-$\ell$ part of these predictions.



4.2.2 Sunyaev-Zeldovich effect 

In addition to primary and secondary CMB anisotropies, which have a $\delta T$
spectrum, other anisotropies generated through Compton interactions will
not have a $\delta T$ spectrum. These anisotropies are referred to as the (thermal)
Sunyaev-Zeldovich effect (hereafter SZ), in which CMB photons are scattered
by free electrons in the hot gaseous component of rich clusters of galaxies.
This translates into an apparent temperature decrement on the low frequency
(Rayleigh-Jeans) side of the spectrum, and a temperature excess on the
Wien side [171]. The intensity variation of the CMB, $\Delta I_\nu/I_\nu = y \times f(x)$, is
controlled by the Compton parameter

$$y = \frac{k\,\sigma_T}{m_e c^2} \int T_e(l)\, n_e(l)\, dl, \tag{4.6}$$

where $T_e$ and $n_e$ stand for the electron temperature and density. The
spectral form factor in the non-relativistic limit depends only on the
dimensionless frequency $x = h\nu/kT_{\rm CMB}$ according to



$$f(x) = \frac{x e^x}{e^x - 1} \left[ x \left( \frac{e^x + 1}{e^x - 1} \right) - 4 \right]. \tag{4.7}$$
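
The change of sign between the Rayleigh-Jeans decrement and the Wien excess can be located numerically. The sketch below assumes the standard non-relativistic form $f(x) = \frac{x e^x}{e^x - 1}\left[x\,\frac{e^x + 1}{e^x - 1} - 4\right]$ and takes $kT_{\rm CMB}/h = 56.8$ GHz, i.e. half of the 113.6 GHz quoted earlier in the text; the function names and the bisection bracket are illustrative choices.

```python
import math

NU0_GHZ = 56.8  # k*T_CMB/h in GHz, assumed as half of the 113.6 GHz used in the text

def sz_form_factor(x):
    """Non-relativistic thermal SZ form factor f(x) for the intensity change."""
    ex = math.exp(x)
    return x * ex / (ex - 1.0) * (x * (ex + 1.0) / (ex - 1.0) - 4.0)

def sz_null(lo=1.0, hi=10.0, tol=1e-10):
    """Bisection for the dimensionless frequency at which f(x) changes sign."""
    f_lo = sz_form_factor(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = sz_form_factor(mid)
        if (f_mid < 0.0) == (f_lo < 0.0):
            lo, f_lo = mid, f_mid  # same sign as the lower end: move it up
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_null = sz_null()
nu_null_ghz = x_null * NU0_GHZ  # frequency of the null in GHz
```

The null falls near $x \simeq 3.83$, that is close to 217 GHz, which is why a channel near that frequency is useful for separating the thermal SZ signal from $\delta T$-type fluctuations.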



This Compton distortion has a spectrum which is very different from a $\delta T$
spectrum and thus can be treated as a foreground and removed by the same
methods as the other foregrounds. It has now been fully observed in one
cluster, as can be seen in Figure 18. It has also been observed in a rapidly
growing number of rich clusters, at least at one frequency, in ground-based
observations; see Rephaeli (1995) and Birkinshaw (1999) for reviews [16,135].
The high angular resolution, sensitivity and frequency coverage of Planck
HFI should lead to the detection of the SZ effect in several thousand clusters
of galaxies (see below).

If the cluster has a peculiar motion with respect to its local standard of
rest, the Compton interactions will also introduce a $\delta T$-type distortion in the
direction of the cluster, which has already been discussed in Section 3.2
devoted to secondary anisotropies. At the level of sensitivity of Planck-HFI,
this effect should lead to valuable statistical information on cluster
peculiar motions (Haehnelt et al. 1996; Aghanim et al. 1997 [4,75]).
Fig. 18. First observation of the full SZ effect (i.e. including the positive side
at high frequency) by Lamarre et al. in 1998. The figure combines data from
ISO-PHT (*), IRAS (x), PRONAOS (empty squares), SuZIE (diamonds), and
Diabolo (Δ). The solid line shows the best fit of a model including the
contributions of: foreground dust (dash line), positive and negative parts of the thermal
SZ effect (dash-dot line), and kinetic SZ effect (dash-dot line in the insert). The
parameters of the fit are: dust temperature $T_d = 14.8 \pm 1$ K, comptonisation
parameter in the direction of the cluster centre $y = 3.42\,(+0.41, -0.46) \times 10^{-4}$, cluster
peculiar velocity $V_{\rm pec} = 975\,(+812, -971)$ km s$^{-1}$, which can also be interpreted
as a negative CMB fluctuation of $-119\,(+99, -118)$ μK. Reprinted from [111].



Figure 19 shows a simulated example of maps of the thermal and kinetic
effects, based on an improved version [124] of the model by Aghanim
et al. [3]. More precisely, cluster number counts were derived from the
Press-Schechter mass function [126] and normalised using the X-ray temperature
distribution function derived from the Henry and Arnaud [83] data
as in Viana and Liddle [163] (assuming spherically symmetric gas profiles).
Realisations were obtained by laying down clusters at random in each redshift
slice, according to the counts, by assigning random bulk velocities to
each, and by summing all redshift slices. Such maps can then be used for
detailed simulations of component separation (see Sect. 7.3 below) as well
as to obtain the power spectrum of the fluctuations due to the thermal SZ
effect.









Fig. 19. Theoretical simulations of the thermal and kinetic SZ effect from rich 
clusters of galaxies. The pictures show a 10° x 10° patch of the sky at 300 GHz. 
The temperature scale is in μK. Reprinted from [124].



For the standard CDM model, it was found that the $y$ fluctuation spectrum
is well-fitted in the range $20 < \ell < 4000$ by

$$\ell(\ell + 1)\, C_\ell = a_{y\rm sz}\, \ell\, [1 + b_{y\rm sz}\, \ell], \tag{4.8}$$

with $a_{y\rm sz} = 4.3 \times 10^{-15}$ and $b_{y\rm sz} = 8.4 \times 10^{-4}$. Of course, the values of
the fitting parameters in equation (4.8) depend on the assumed cosmological
scenario. For an open model ($\Omega_0 = 0.3$), one finds instead $a_{y\rm sz} = 4.6 \times 10^{-15}$
and $b_{y\rm sz} = 2.2 \times 10^{-3}$. These are quite small differences, as illustrated
by Figure 20. This is natural since most of the contributions come from
relatively low redshifts while the counts have precisely been normalised via
$z \sim 0$ observations. Equation (4.8) for standard CDM yields

$$\ell\, C(\ell)^{1/2} = 0.27\, [\ell\,(1 + 8.4 \times 10^{-4}\, \ell)]^{1/2}\ \mu{\rm K} \tag{4.9}$$

for the temperature fluctuation spectrum at 100 GHz.
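
As a consistency check on the 0.27 μK amplitude at 100 GHz, the normalisation of the $y$ spectrum can be converted into thermodynamic temperature. The sketch below assumes the standard non-relativistic relation $\Delta T/T = y\,[x \coth(x/2) - 4]$, which is not written out in the text, a CMB temperature of 2.726 K (so that $kT/h = 56.8$ GHz, half of the 113.6 GHz used earlier), and the standard CDM value $a_{y\rm sz} = 4.3 \times 10^{-15}$. Function names are illustrative.

```python
import math

T_CMB_MICROK = 2.726e6  # assumed CMB temperature, in microkelvin
NU0_GHZ = 56.8          # k*T_CMB/h in GHz (half of the 113.6 GHz used in the text)

def sz_temperature_factor(nu_ghz):
    """Thermodynamic-temperature SZ factor g(x) = x*coth(x/2) - 4, with dT/T = y*g(x)."""
    x = nu_ghz / NU0_GHZ
    return x / math.tanh(0.5 * x) - 4.0

def sz_amplitude_microk(a_ysz, nu_ghz):
    """Prefactor of [l (1 + b l)]^(1/2) in l*C(l)^(1/2), in microkelvin."""
    return math.sqrt(a_ysz) * T_CMB_MICROK * abs(sz_temperature_factor(nu_ghz))

amp_100 = sz_amplitude_microk(4.3e-15, 100.0)  # compare with the 0.27 of equation (4.9)
```

Under these assumptions the prefactor comes out within a few parts in a thousand of 0.27 μK, and the negative sign of the factor at 100 GHz reflects the decrement on the Rayleigh-Jeans side.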

On small angular scales (large $\ell$), the power spectrum of the SZ thermal
effect exhibits the characteristic dependence of white noise. This arises
because at these scales the dominant signal comes from the point-like
unresolved cluster cores. On large scales (small $\ell$), the contribution to the power
comes from the superposition of a background of unresolved structures and
extended structures. The transition between the two regimes occurs for
$\ell \sim 1/b_{y\rm sz}$, that is when the angular scale is close to the pixel size of the
simulated maps. One should note though that this approach neglects the
effect of the incoherent superposition of lower density ionised regions like
filaments. While the contributions in the linear regime are easy to compute
Fig. 20. Power spectra of the Sunyaev-Zeldovich thermal effect for different
cosmological models: standard CDM (solid line), open CDM (dashed line) and
lambda CDM (dotted line) are shown.



(Persi et al. [123]), the non-linear contributions must be evaluated using 
numerical techniques (Persi et al. 1995; Pogosyan et al. 1996 [123,125]). 
While the modelling above is quite sufficient for the purposes of this pa- 
per, detailed simulations of the component separation should benefit from 
simulated maps of the SZ effect with more realistic low contrast patterns. 

4.3 Putting it all together 

In this section, we summarise a simple sky model based on discussions 
above in order to compare the various contributions in the spectral range 
relevant to CMB experiments. This model was initially developed during 
the preparation of the scientific case for Planck [14,26, 124]. 

4.3.1 A simple sky model 

Galactic emissions Our purpose justifies using a simplified galactic
description which is only appropriate for the low column density, high galactic
latitude part of the sky (the best 50% from a CMB point of view). As a
result, the dust spectral behaviour is modelled as a single temperature
component with $T_d = 18$ K and a power-law emissivity. The spectral behaviour of the
synchrotron emission is assumed to follow a power law in frequency, while the rest of the
emission is assumed to follow a separate power law.

As was shown above, the angular power spectra $C(\ell)$ of the galactic
components all decrease strongly with $\ell$, approximately as $C(\ell) \propto \ell^{-3}$. Smaller
[Figure 21, panel title: Contributions at 9.9 mm (30.3 GHz)]
Fig. 21. Contributions to the fluctuations of the various components, as a function
of angular scale, at different frequencies. The thick solid black line corresponds to
the CMB fluctuations of a COBE-normalised CDM model. The dots, dashes and
dot-dashes (in red, blue, and green) refer respectively to the HI correlated, HI
uncorrelated, and synchrotron emissions of the galaxy. The light blue triple
dot-dashes display the contributions from unresolved radio sources, while the purple
line corresponds to the unresolved contribution from infrared sources. Long dashes
show the “on-sky” noise level (see text, Eq. (4.15)). On the 30 GHz plot, the highest
noise level corresponds to MAP, and the lower one to the LFI. On the 100 GHz
plot, the highest noise level corresponds to the LFI, and the lower one to the HFI.
At 217 GHz, the noise levels are those of BolobBall (upper curve) and the HFI,
while on the last plot only the HFI noise level is shown. Reprinted from [27].



angular scales thus bring increasingly small contributions per logarithmic
interval of $\ell$ to the variance, $\ell(\ell + 1)C(\ell)$: the galactic sky gets smoother on
smaller angular scales.

To set the normalisation constants of the angular spectra, Bouchet and 
Gispert simulated the Galactic emission using the spectral behaviours 
[Figure 22, panel title: Contributions at 3.0 mm (100.0 GHz)]
Fig. 22. Contributions to the fluctuations of the various components at 100 GHz
compared with the difference between CMB spectra differing by a 2% variation
in $H_0$ (all other parameters being kept equal). Reprinted from [27].



mentioned above and the Haslam and DIRBE maps as spatial templates.
They further assumed that the free-free-like emission (either due to real
free-free or arising from spinning dust grains) could be split in two parts,
one correlated with the HI emission, and one part uncorrelated with it.
They find (see [27] for details) that

$$\ell\,C_\ell^{1/2} = c_X\,\ell^{-1/2}\ \mu{\rm K}, \tag{4.10}$$

with $c_X$ given by $c_{\rm sync} = 2.1$, $c_{\rm free} = 13.7$, and $c_{\rm dust} = 13.5$ at 100 GHz.
Alternatively, one has $c_{\rm HI-U} = 8.5$ and $c_{\rm HI-C} = 20.6$ for the components uncorrelated
and correlated (respectively) with HI. In the following we shall use the
representation in terms of the supposedly spatially independent HI-correlated
and HI-uncorrelated parts rather than the dust and free-free-like representation.
Note that these normalisations are only appropriate by construction for
intermediate and small scales, about $\ell \gtrsim 10$.

Extragalactic contributions In order to model the unresolved background
from sources, we combine equations (4.3), (4.4), and (4.5) from the
Toffolatti et al. and Guiderdoni et al. models






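$$\ell\,C(\ell)^{1/2} \simeq \frac{7.1 \times 10^{-9}}{e^{x/2.53} - 1} \left(1 - \frac{0.16}{x^4}\right) \frac{\sinh^2 x}{x^{0.3}}\ \ell\ [{\rm K}] \quad (\nu > 100\ {\rm GHz}), \tag{4.11}$$

$$\ell\,C(\ell)^{1/2} \simeq 6.3 \times 10^{-9}\,(0.8 - 2.5x + 3.38x^2)\,\frac{\sinh^2 x}{x^4}\ \ell\ [{\rm K}] \quad (\nu < 100\ {\rm GHz}), \tag{4.12}$$

$$\ell\,C(\ell)^{1/2} \simeq \frac{5.7\,\sinh^2(\nu/113.6)}{(\nu/1.5)^{4.75 - 0.185\log(\nu/1.5)}}\ \ell\ [{\rm K}]. \tag{4.13}$$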



For the SZ spectrum, we use equation (4.8)

$$\ell\,C(\ell)^{1/2} = 0.27\,[\ell\,(1 + 8.4 \times 10^{-4}\,\ell)]^{1/2}\ \frac{f(\nu)}{f(100\ {\rm GHz})}\ \mu{\rm K}, \tag{4.14}$$

where the SZ spectral behaviour, $f(\nu)$, obtains from equation (4.7).



4.3.2 Detector noise “backgrounds” 

In the course of the measuring process, detector noise is added to the total 
signal after the sky fluctuations have been observed, which can be described 
by a convolution with the beam pattern. In order to directly compare the 
astrophysical fluctuations with those coming from the detectors, it is con- 
venient to derive a fictitious noise field “on the sky” which, once convolved 
with the beam pattern and pixelised, will be equivalent to the real one [100]. 

Modelling the angular response of channel $i$, $w_i(\theta)$, as a Gaussian of
FWHM $\theta_i$, and assuming a white detector noise power spectrum, the sky
or “unsmoothed” noise spectrum is then simply the ratio of the constant
white noise level to the square of the spherical harmonic transform of the
beam profile



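$$C_i(\ell) = c_{\rm noise}^2 \exp\left[\frac{(\ell + \frac{1}{2})^2}{(\ell_i + \frac{1}{2})^2}\right] \simeq c_{\rm noise}^2 \exp\left[\left(\frac{\theta_i\,\ell}{2\sqrt{2\ln 2}}\right)^2\right], \tag{4.15}$$

with

$$\left(\ell_i + \tfrac{1}{2}\right)^{-1} = 2 \sin\left(\frac{\theta_i}{2\sqrt{8\ln 2}}\right), \tag{4.16}$$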



and $c_{\rm noise}^2 = \sigma_i^2\,\Omega_i = \sigma_i^2 \times 2\pi[1 - \cos(\theta_i/2)]$, if $\sigma_i$ stands for the 1σ $\Delta T/T$
sensitivity per field of view. The values of $c_{\rm noise}$ used in the plots for different
experiments of interest can be found in Table 3.
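
The construction of this fictitious on-sky noise can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: it takes the Gaussian beam transform as $w_i(\ell) = \exp[-(\ell + \frac{1}{2})^2 / 2(\ell_i + \frac{1}{2})^2]$, so that dividing the white level by $w_i^2$ gives $c_{\rm noise}^2 \exp[(\ell + \frac{1}{2})^2/(\ell_i + \frac{1}{2})^2]$, with $\ell_i$ from the relation $(\ell_i + \frac{1}{2})^{-1} = 2\sin(\theta_i / 2\sqrt{8 \ln 2})$ and $c_{\rm noise}^2 = \sigma_i^2 \times 2\pi[1 - \cos(\theta_i/2)]$. Function names are illustrative.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arcminute

def beam_multipole(theta_fwhm_rad):
    """l_i defined by (l_i + 1/2)^-1 = 2 sin(theta_i / (2 sqrt(8 ln 2)))."""
    half_sigma = theta_fwhm_rad / (2.0 * math.sqrt(8.0 * math.log(2.0)))
    return 1.0 / (2.0 * math.sin(half_sigma)) - 0.5

def noise_normalisation(sigma_fov, theta_fwhm_rad):
    """c_noise^2 = sigma_i^2 * 2 pi [1 - cos(theta_i / 2)] (sensitivity times field of view)."""
    return sigma_fov ** 2 * 2.0 * math.pi * (1.0 - math.cos(0.5 * theta_fwhm_rad))

def on_sky_noise(ell, sigma_fov, theta_fwhm_rad):
    """White noise level divided by the squared (assumed Gaussian) beam transform."""
    l_i = beam_multipole(theta_fwhm_rad)
    c2 = noise_normalisation(sigma_fov, theta_fwhm_rad)
    return c2 * math.exp((ell + 0.5) ** 2 / (l_i + 0.5) ** 2)
```

For a 5 arcmin beam, `beam_multipole(5 * ARCMIN)` is of order 1600: below that multipole the on-sky noise is close to the white level, and above it the noise rises exponentially, which is how the beam limits the accessible range of $\ell$.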

While convenient, the noise description above is quite oversimplified. 
Indeed detector noise is neither white, nor isotropic. The level of the noise 
Table 3. Summary of experimental characteristics used for comparing experiments.
Central band frequencies, $\nu$, are in Gigahertz, the FWHM angular sizes,
$\theta_{\rm FWHM}$, are in arcminutes, and $\Delta T$ sensitivities are in μK per $\theta_{\rm FWHM} \times \theta_{\rm FWHM}$
square pixels; the implied noise spectrum normalisation, $c_{\rm noise} = \Delta T\,(\Omega_{\rm FWHM})^{1/2}$,
is expressed in μK·deg.
depends in particular on the total integration time per sky pixel, which
is unlikely to be evenly distributed for realistic observing strategies. In
addition, below a (technology-dependent) “knee” frequency, the white noise
part of the detector spectrum becomes dominated by a steeper component
(typically in $1/f$, if the frequency $f$ is the Fourier conjugate of time). This
extra power at long wavelengths will typically translate into “striping” of
the noise maps, a common disease. Of course, redundancies can be used to
lessen this problem or even reduce it to negligible levels if the knee frequency
is small enough (see Wright 1996; Janssen 1996; Delabrouille 1998 [44, 92,
169] and Sect. 6.5 below).

4.3.3 Comparing contributions 

Figure 21 compares at 30, 100, 217 and 857 GHz the power spectrum of the 
expected primary anisotropies (in a standard CDM model) with the power 
contributed by the galactic emission (Eq. (4.10)), the unresolved background 
of radio and infrared sources (Eqs. (4.3, 4.4, 4.5)), the S-Z contribution from 
clusters (Eq. (4.8)), and on-sky noise levels (Eq. (4.15)) corresponding to 
the MAP and Planck missions. It is interesting to note that even at 
100 GHz, the dust contribution might be stronger than the one coming 
from the synchrotron emission, at least for levels typical of the best half of 
the sky. Since point source processes have flat (“white noise”) spectra, their
logarithmic contribution to the variance, $\ell(\ell + 1)C(\ell) \propto \ell^2$, increases and
becomes dominant at very small scales.

These comparisons suggest that future space experiments should be capable
of measuring the CMB power spectrum with very high precision.
On the other hand, the values of the cosmological parameters are precisely
encoded in small variations of the shape of the CMB spectrum.
Ignoring foregrounds, Planck could in principle measure the main cosmological
parameters with a few per cent accuracy for fluctuations generated
during an inflationary phase. In order to better see what this entails, we
compare in Figure 22 the 100 GHz foreground contributions with the
difference between the CMB power spectrum of Figure 21 and the CMB spectrum
when the Hubble constant has been set to 51 km s$^{-1}$/Mpc instead of
50 km s$^{-1}$/Mpc, all other parameters being maintained equal. Clearly at
this level of precision, the foregrounds issue becomes of paramount importance
and a quantitative analysis becomes necessary.

Given the power and frequency spectra of the microwave sky model, one
can then compare the rms contributions per beam at any frequency $\nu_i$ for
any experiment by

$$\sigma^2(\nu_i) = \sum_\ell \frac{2\ell + 1}{4\pi}\, C(\ell, \nu_i)\, w_i(\ell)^2, \tag{4.17}$$

where $w_i$ stands for the transform of the beam profile at the frequency $\nu_i$.
Figure 23 gives an example for the Planck case.
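
A minimal sketch of such a per-beam variance is given below. It assumes the standard relation $\sigma^2 = \sum_\ell \frac{2\ell + 1}{4\pi} C(\ell)\, w(\ell)^2$ and, for the beam, the common Gaussian approximation $w(\ell) = \exp[-\ell(\ell + 1)\sigma_b^2/2]$ with $\sigma_b = \theta_{\rm FWHM}/\sqrt{8 \ln 2}$; both the function names and the choice of beam approximation are assumptions made for illustration.

```python
import math

def gaussian_beam(lmax, theta_fwhm_rad):
    """Gaussian beam transform w_l = exp(-l(l+1) sigma_b^2 / 2) for l = 0..lmax (assumed form)."""
    sigma_b = theta_fwhm_rad / math.sqrt(8.0 * math.log(2.0))
    return [math.exp(-0.5 * ell * (ell + 1) * sigma_b ** 2) for ell in range(lmax + 1)]

def variance_per_beam(cl, beam):
    """Sum over l of (2l+1)/(4 pi) * C_l * w_l^2, with cl[l] and beam[l] indexed from l = 0."""
    if len(cl) != len(beam):
        raise ValueError("cl and beam must cover the same multipole range")
    return sum((2 * ell + 1) / (4.0 * math.pi) * c * w * w
               for ell, (c, w) in enumerate(zip(cl, beam)))
```

The rms per beam is the square root of this sum. For a white spectrum and no beam smoothing the sum up to $\ell_{\max}$ reduces to $C\,(\ell_{\max} + 1)^2/4\pi$, which gives a quick check of an implementation.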

5 Observations of CMB anisotropies 

5.1 From raw data to the physics of the early Universe 

Measurements of the CMB are quite unique in the ensemble of astrophysical 
observations that are used to constrain cosmological models. They have the 
same character as fundamental physics experiments. 

On the one hand, as discussed in Section 3.1, for a given cosmological 
model (defined set of cosmological parameters and fundamental physics pre- 
dicting a well defined outcome of the early universe) the cosmic background 
radiation anisotropies after recombination can be computed very accurately. 
This is due to the linear and close to thermal equilibrium character of the 
physics involved in the transfer function for the radiation from the early 
universe to the time of recombination. 

On the other hand, as was seen in Section 4, the CMB anisotropies
dominate the structure of the sky at frequencies around 100 GHz and make
up about 90% of the RMS fluctuations at this frequency.

This leads to the following strategy in the use of the CMB anisotropy
measurements as the most powerful observational tool to constrain fundamental
physics at very high energy. Observers will aim at producing a very
high sensitivity map of temperature fluctuations of the CMB over most of
the sky with very well controlled noise properties and systematic effects.
This will be possible, starting from high accuracy sky maps measured at
many frequencies, and using the essential property of the CMB primary
anisotropies that their spectrum is known: the derivative of the Planck
Fig. 23. Standard deviation per beam versus frequency for all the relevant model
components in the Planck case. Reprinted from [27].



function with respect to temperature. With maps taken at a large enough
number of well chosen frequencies, we shall show that one can extract a CMB
anisotropy map, separating it from galactic and extragalactic foregrounds.

The confrontation of the set of predictions from cosmological models
(including the relevant fundamental physics) is not done directly with the
observations but with a much more elaborate product: the CMB temperature
anisotropies map (and its statistical characterisation, for example the
power spectrum).

The two processes: going from fundamental physics and cosmological 
parameters to a predicted CMB map on one hand and going from raw data 
to CMB map on the other hand are summarised in Table 4 and Figure 24. 
Table 4. From the early Universe to the CMB anisotropies.

Epoch (in order of decreasing redshift)      Processes and CMB signatures

Early Universe; nucleosynthesis;             Adiabatic and isocurvature perturbations;
CMB formation; radiation-dominated           topological defects; $\rho_\gamma$, $\rho_b$, $\rho_{\rm CDM}$, $\rho_\nu$
era; matter-dominated era                    perturbations; Bose-Einstein and μ fluctuations;
                                             development of linear fluctuations under $\delta\phi$
Recombination                                Sachs-Wolfe
Neutral Universe                             Integrated SW; first non-linear structures
                                             and lensing
Reionisation (redshift 6-10)                 Inhomogeneous reionisation
Clusters of galaxies (redshift 1)            Kinetic SZ; thermal SZ
Redshift 0                                   $\delta T$ perturbations; $y$ distortion


5.2 Observational requirements 

There is currently no need for absolute photometry of the uniform compo- 
nent of the emission of the sky which has already been measured by the 
COBE satellite. The observation strategy is to measure the anisotropy 
of the CMB radiation, i.e. brightness structure of the sky, over the fre- 
quency bands where contamination from foreground sources is at a min- 
imum and the CMB signal is at a maximum. Emission from foreground 















[Figure 24: flow chart. Legible box labels: Ground calibrations; Maps of co-added data; the Instrument; Systematics subtraction; Four Low Frequency calibrated maps from LFI; Six Frequency Band calibrated maps; Data from former experiments; Component separation; Source catalogues; C(l) analysis and other detailed tests of cosmological models.]



Fig. 24. From raw data to maps. This illustrates the main steps in transforming
the TOD into useful products for theoretical comparisons. One goes iteratively
to maps at each frequency, then to maps and catalogs of astrophysical sources and
to their characterisation (number counts, power spectra...). Reprinted from [124].



contributions (from the Galaxy and extra-galactic sources) will be estimated 
and removed from the sky maps by measuring the spectral signature of the 
fluctuations over a wide frequency range. The critical cosmological information
is contained within the cleaned temperature anisotropies map, while
the knowledge of foreground emissions is in itself a major scientific product. 
The measurement of polarisation in some bands is also of major interest, 
since it gives additional and unique data on the CMB that can be inter- 
preted either independently from the intensity measurement or through its 
correlations with intensity. 

The observational requirements for CMB measurements are closely re- 
lated to the nature of the physical processes that we want to unveil: 

1. We want to measure quite tiny relative temperature fluctuations (the
needed $\Delta T/T$ accuracy is close to $10^{-6}$). As a consequence, we aim
at getting maps of the sky which are photon noise limited at the
frequencies where the CMB dominates (100 to 200 GHz);

2. We are interested in the statistics of these fluctuations, the simplest
of which is the (angular) power spectrum, $C(\ell)$. If the observed
fraction of the sky, $f_{\rm sky}$, is too small, the statistics of the observation
will not be well determined ($\Delta C$ is approximately $\propto f_{\rm sky}^{-1/2}$). In order
to retrieve precise physics from the measurements, they must include
a large number of independent samples of the fluctuations, i.e. most
of the sky;

3. The information we need is mostly contained in the first 3000 $\ell$-modes
of the decomposition into spherical harmonics. This corresponds to 
angular resolutions of about 4 arcmin. Higher angular frequencies 
should be completely dominated by foreground point sources (and 
secondary effects); 

4. At the same time, strong sources (such as the Galaxy) far from the 
optical axis will contaminate the measurement. Very low optical side- 
lobes are needed up to large angular distance from the main beam to 
keep the level of contamination comparable to or smaller than the in- 
strumental noise. Such very low side lobes (rejection of 10^^ or better 
at large angles for frequencies larger than 300 GHz) cannot be fully 
measured on the ground and upper limits or measurements must be 
extracted from the data themselves with the required accuracy; 

5. Unidentified sources of noise can be confused with the cosmological
signal. The best way to fight such noise is to get data with a high
level of redundancy. This high level of redundancy is also used to
identify any systematic effect, including the side lobe signal.
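
The sky-coverage scaling in requirement 2 can be made concrete with a short sketch. It assumes the usual sampling-variance estimate for a Gaussian field in the absence of noise, $\Delta C_\ell / C_\ell = \sqrt{2 / [(2\ell + 1) f_{\rm sky}]}$, which is not written out in the text; the function name is illustrative.

```python
import math

def relative_cl_error(ell, f_sky):
    """Sampling-variance estimate dC/C = sqrt(2 / ((2l+1) f_sky)); Gaussian field, no noise (assumed)."""
    if not 0.0 < f_sky <= 1.0:
        raise ValueError("f_sky must lie in (0, 1]")
    return math.sqrt(2.0 / ((2 * ell + 1) * f_sky))
```

Observing a quarter of the sky instead of the full sky doubles the error on each $C(\ell)$, which is the $f_{\rm sky}^{-1/2}$ dependence quoted in the list.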

Requirements 1 to 3 are difficult to meet simultaneously. Small beams and large
sky coverage imply a large number of pixels ($4 \times 10^7$ measurements for the
full sky with 5 arcmin beams sampled ~ 2.5 times per beam). For a given
duration of the observation, this limits the possible integration time per
pixel, and thus the map sensitivity. At the same time, the RMS value of the
expected signal falls rapidly at high $\ell$.
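
The order of magnitude of the number of measurements follows from simple geometry, as the sketch below shows. It assumes square samples of side equal to the beam FWHM divided by the number of samples per beam; the function name is illustrative.

```python
import math

def full_sky_samples(beam_fwhm_arcmin, samples_per_beam):
    """Number of square samples needed to tile the full sky at the given sampling."""
    sky_arcmin2 = 4.0 * math.pi * (180.0 * 60.0 / math.pi) ** 2  # full sky in arcmin^2
    sample_side = beam_fwhm_arcmin / samples_per_beam            # arcmin
    return sky_arcmin2 / sample_side ** 2

n_samples = full_sky_samples(5.0, 2.5)  # 5 arcmin beam sampled 2.5 times per beam
```

With 2 arcmin samples the full sky (about $1.5 \times 10^8$ arcmin$^2$) requires a few times $10^7$ measurements, in line with the figure quoted above.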

It can be shown that for a given instrument and a given duration of
the observation, there is an optimal sky coverage and an optimal angular
resolution [152]. They correspond roughly to the case when the contributions
to the RMS which come from the signal and the detector noise are
about equal at the scale of the elementary map pixel. One can see from
Figure 21 that this is indeed the case for the 5 arcmin channel at 217 GHz
of Planck-HFI (signal and noise $C_\ell$ cross at $\ell \sim 2000$). A higher signal
to noise, while not statistically optimal, might nevertheless help with the
identification and removal of weak systematic effects.

Requirement 5 implies a need for redundancy at many different time 
scales, which means that each measurement has to be made many times, in 
as varied conditions as possible. 

5.3 Reaching the ultimate sensitivity 

Up to now, instrument sensitivity and/or observation time have severely
limited the scientific return from observations. A new generation of instruments
can now have its sensitivity limited by fundamental processes, such as
the photon noise from the CMB itself. This is expected to bring a revolution
in the quality of the CMB data. These instruments use as detectors bolometers
cooled to sub-Kelvin temperatures, read in total power thanks to very
stable pre-amplifiers and placed behind low-emissivity off-axis telescopes.
Such instruments are already used for CMB observations from balloon-borne
platforms (BOOMERanG, MAXIMA, ARCHEOPS). They have instantaneous
sensitivities 3 to 5 times worse than the ultimate ones expected from
satellite experiments and thus go a long way in proving the feasibility of
the latter. The shorter integration time of these balloon-borne experiments
with respect to satellite experiments (a factor from 100 to several hundred)
puts their sensitivity at a factor 30 to 50 from the ultimate ones.
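
The factor of 30 to 50 can be recovered with a one-line estimate. The sketch below assumes white noise, so that the noise averages down as the inverse square root of the integration time, and uses the lower figure of 100 for the integration time ratio; the function name is illustrative.

```python
import math


def relative_sensitivity(instantaneous_factor, integration_time_ratio):
    """Final sensitivity ratio, assuming noise averages down as 1/sqrt(time)."""
    return instantaneous_factor * math.sqrt(integration_time_ratio)


# Instantaneous sensitivities 3 to 5 times worse, integration time 100 times shorter.
for factor in (3.0, 5.0):
    print(f"factor {factor:.0f} -> {relative_sensitivity(factor, 100.0):.0f} times worse")
```

A ratio of several hundred in integration time would push the upper end of the range higher, so the quoted 30 to 50 corresponds to the favourable end of the time ratio.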

The first results of two of these experiments are given by De Bernardis
et al. (2000) and Lange et al. (2000) for BOOMERanG [41,112] and Hanany
et al. (2000) for the MAXIMA experiment [77]. The BOOMERanG and
MAXIMA maps may be seen in Figures 27 and 35 respectively.

The amount of information that one can retrieve from the measurement 
of electromagnetic radiation is limited by the quantum fluctuations of the 
radiation itself (photon noise). For radiation of thermal origin photon noise 
is mainly due to the shot noise of photons at short wavelengths and photon 
interference at long wavelengths [109]. The ultimate sensitivity of an ideal 
experiment is the photon noise induced by the source itself, here by the 
CMB. Reaching this limit is possible only by meeting two main conditions: 
(i) detectors must be sensitive enough, even when loaded by the total power

Fig. 25. The power on the detectors of Planck-HFI originates from internal and
external sources. At low frequencies (long wavelengths), photon noise from the
CMB itself limits the sensitivity.



from the CMB, and (ii) all sources of spurious radiation should be much 
smaller than the CMB. This concerns in particular the thermal emission 
of the telescope and instrument. It can be shown [110] that this can be 
reached in the millimetre range by cryogenic cooling of the instrument and 
by a good telescope design and a passive cooling of the telescope in orbit. 
This is important because antennas needed for arc minute resolutions at 
millimetre wavelengths are too large to be put inside realistic cryostats. 
At submillimetre wavelengths, the thermal radiation from the telescope be- 
comes the dominant source of background. An illustration of this situation 
in the case of the Planck project is given in Figure 25. 
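
The two regimes of photon noise mentioned above (shot noise at short wavelengths, photon interference at long wavelengths) can be illustrated with the photon occupation number of a blackbody. The sketch below assumes the standard Bose-Einstein result that the variance of the photon number per mode is n + n^2, so the ratio of the wave term to the shot term is simply n. The CMB temperature of 2.725 K is an assumed value, and the calculation ignores the optical efficiency of a real instrument, which lowers the effective n.

```python
import math

H_PLANCK = 6.62607015e-34    # J s
K_BOLTZMANN = 1.380649e-23   # J / K
T_CMB = 2.725                # K, assumed value for this illustration


def occupation_number(freq_hz, temp_k=T_CMB):
    """Mean photon number per mode of a blackbody at temp_k."""
    return 1.0 / math.expm1(H_PLANCK * freq_hz / (K_BOLTZMANN * temp_k))


def crossover_frequency_hz(temp_k=T_CMB):
    """Frequency at which n = 1, i.e. shot and wave terms are equal."""
    return math.log(2.0) * K_BOLTZMANN * temp_k / H_PLANCK


for ghz in (30, 100, 217, 857):
    n = occupation_number(ghz * 1e9)
    regime = "wave term dominates" if n > 1.0 else "shot noise dominates"
    print(f"{ghz:4d} GHz: n = {n:.3g} ({regime})")

print(f"crossover near {crossover_frequency_hz() / 1e9:.1f} GHz")
```

Under these assumptions the crossover lies near 40 GHz, so for the CMB itself the shot-noise term dominates over the whole HFI range, while the interference term matters at the lowest frequencies.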

Another critical aspect of high sensitivity mapping instruments is the
strategy used to build full maps from data obtained by scanning the sky
with limited fields of view. Low frequency noises will produce structures
in the map along the scanning direction. The example of the IRAS maps,
where “striping” is visible even after careful processing, shows how difficult it
may be to remove such features. All types of detectors naturally suffer from
low frequency noises originating in the “1/f” noise of the electronics and/or
the thermal fluctuations of all instrument parts (f being the frequency in
a time domain Fourier analysis).

The principle chosen for the COBE-DMR experiment and for the MAP
project is to use differential measurements between very distant parts
of the sky, and to reconstruct a whole sky map by an elegant inversion
method. An opposite solution, often called the “total power”

[Figure: COBE-DMR map of CMB anisotropy; colour scale from -100 μK to +100 μK]

Fig. 26. The final (4 years) COBE-DMR map of the last scattering surface
(Galactic polar caps). This map was obtained from the COBE web site
http://space.gsfc.nasa.gov/astro/cobe



solution, has been adopted for the Planck focal plane instruments and for
the BOOMERanG and ARCHEOPS balloon-borne experiments. Low frequency
noises have been pushed to very low frequencies thanks to major improvements
in the measuring systems. By using electronic modulation of the detector
systems, the useful signal is shifted in frequency to a range where 1/f
noises become negligible. Then, the measurement is repeated periodically,
and very low frequencies are removed by a proper method (see Sect. 6).

The first detection of the CMB by Penzias and Wilson and the first 
detection of its anisotropies by COBE-DMR (see Fig. 26) have been made 
thanks to radio receivers in the millimetre range. On the other hand, the 
first extended high sensitivity map of the CMB has recently been obtained 
by BOOMERanG and MAXIMA (Figs. 27 and 35) by using bolometers in 
the millimetre/submillimetre range ($\lambda < 3$ mm). The BOOMERanG map
has ~ 60 000 pixels covering less than 5% of the sky (while DMR had 
~ 6000 pixels covering the whole sky). 

Bolometers are significantly more sensitive to CMB anisotropies than
radio receivers. Unfortunately, they are more difficult to use because of the
low operating temperatures needed. Reaching 0.3 K or 0.1 K in a space
experiment is a challenge which has never been met yet. In addition, electronics

Fig. 27. BOOMERanG high resolution map of 2.5% of the last scattering surface
with a resolution of 12' (the lower right corner shows the size of the moon for
comparison). Details may be found in [41], while this map comes from the web
site of BOOMERanG (see www.physics.ucsb.edu/~boomerang/press_images from
North America and oberon.roma1.infn.it/boomerang from Europe).



with no 1/f noises down to $10^{-2}$ Hz have been developed only recently.
A good picture of how bolometers and cooled radio receivers compare can 
be obtained by considering the two focal plane instruments of Planck, 
HFI for bolometers, and LFI for radio receivers. With the current tech- 
nology, bolometers are more sensitive than the best radio receivers for all 
frequencies larger than 30 GHz. But due mainly to the technical difficulty 
of cooling big devices at 0.1 K, the frequency range of the HFI is limited to 
frequencies larger than 100 GHz (i.e. to wavelengths of 3 mm or less).



5.4 Present status of observations

CMB anisotropy experiments can be classified in three generations. The
DMR experiment on COBE obtained the first detection of the CMB anisotropies
with a 7 deg beam and a signal to noise per pixel around 1.

The second generation experiments are ongoing or about to operate.
They have an angular resolution approaching 10 arcmin. The sensitivity is
limited by uncooled detectors for most of them and by ambient temperature
telescopes for the most recent balloon-borne bolometer experiments:
MAXIMA, BOOMERanG, ARCHEOPS. These experiments are well
within a factor of 2 of the photon noise of their ambient temperature
telescope.

The third generation will be the Planck mission which will have a low 
background provided by its 50 K passively cooled telescope large enough 
to provide 5 arcmin resolution and 0.1 K bolometers which will be photon 
noise limited. 

Many locations have been considered for CMB observations: ground
based telescopes at many places in the world, including the South Pole,
airplanes, stratospheric balloons, satellites around the Earth, and in the near
future, satellites far from the Earth. The main parameters related to the 
location are the atmospheric absorption and emission and the proximity of 
sources of straylight. 

The Earth's atmosphere is a very strong source at millimetre and
submillimetre wavelengths [120]. It is therefore a source of photon noise. In
addition, it is not perfectly uniform and the slowly changing structure of its 
emission adds noise that can be confused with CMB anisotropies. Even at 
stratospheric balloon altitudes (38 to 40 km), one may encounter structures 
at low angular frequencies that limit measurement stability. Only satellites 
are free of this first “layer” of spurious emission. Straylight coming from the 
Earth is a common problem for all locations except for space probes distant 
from the Earth as seen from the instrument. The brightness temperature 
of about 250 K of the Earth times its solid angle has to be compared with 
the instrument sensitivity in brightness times the beam solid angle. For a 
mission in low earth orbit the ratio is 6 x 10^^ and at the L2 Lagrange point 
the ratio is 5 x 10®. 

At the wavelengths of interest, diffraction is a major source of straylight, 
and modelling and experimentation can hardly estimate reliably or measure 
the very low side lobes needed to meet this extremely high rejection ratio. 
The problem is severe enough to have motivated the choice of the Lagrangian 
point L2 of the Sun-Earth system, at about 1.5 million kilometres from 
Earth, to locate the new generation of satellites MAP and Planck. 

5.5 Future satellite observations: MAP, Planck 

Two satellites, MAP and Planck, are currently in preparation for CMB 
observations. MAP is American, and Planck is mainly European. Both 
will be put in an orbit around the Lagrangian point L2 of the Sun-Earth 
system, to minimise parasitic radiation from Earth. Both are based on the 
use of off-axis Gregorian telescopes in the 1.5 m class.

Fig. 28. Measurements of the power spectrum. The first plot shows all published
detections at the end of 1996, while the second plot is an update at the
end of 1999. The last plot shows the latest results published in May 2000 by the
BOOMERanG and MAXIMA teams (with each curve moved by + or - 1σ of
their respective calibration, see [77]). These spectra correspond respectively to
the maps of Figures 27 and 35.

MAP has been designed for rapid implementation, and is based on
fully demonstrated solutions. Its launch should take place around the end
of the year 2000. Its observational strategy uses the differential scheme
(see Sect. 5.3). Two telescopes are put back to back and feed differential
radiometers. These radiometers use High Electron Mobility Transistors
(HEMTs) for direct amplification of the radio-frequency (RF) signal.
Angular resolutions are not better than 10 minutes of arc.

Planck is a more ambitious and complex project, which is planned to 
be launched at the beginning of 2007. It is designed to be the ultimate 
experiment in several aspects. In particular, several channels of the High 
Frequency Instrument (HFI) will reach the ultimate possible sensitivity, 
limited by the photon noise of the CMB itself. Bolometers cooled to 0.1 K
will allow reaching this sensitivity and, at the same time, an angular
resolution of 5 minutes of arc. The Low Frequency Instrument (LFI), limited
to frequencies below 100 GHz, will use HEMT amplifiers cooled to 20 K
to increase their sensitivity. The scan strategy is of the total power
type. Both instruments use internal references to obtain this total power
measurement: a 0.1 K heat sink for the bolometers, and a 4 K
radiative load for the LFI. The combination of these two instruments on
Planck is motivated by the necessity to map the foregrounds in a very broad 
frequency range: 30 to 850 GHz. 

5.6 Description of the Planck High-Frequency Instrument 

We now describe in more detail the Planck-HFI instrument, which is the one
reaching the ultimate sensitivity and which could make cosmologically
meaningful polarisation measurements over most of the sky, thus being what we
defined above as the third generation CMB experiment.

The HFI is a multi-band instrument with 6 bands spanning the 100 to
857 GHz range. The HFI must have enough pixels at each
frequency in the cross-scan direction to ensure proper sampling of the sky
as the satellite spin axis is de-pointed in steps of 2.5 minutes of arc.

The instrument includes the capability to measure the polarisation of the
microwave emission at several frequencies: 143 and 217 GHz (best CMB
channels), and 545 GHz to map the polarisation of the dust emission. Further,
the number of detectors per frequency provides increased sensitivity and
improved redundancy. This leads to a focal plane layout of 48 detectors.

5.6.1 Instrument concept 

The concept is based on the use of bolometers cooled to 0.1 K, which are the
most sensitive detectors for wide band photometry in the HFI spectral range.
Bolometers are sensitive to the heat deposited in an absorber by the incident
radiation. Very low temperatures are required to obtain a low heat capacity

Fig. 29. Schematic layout of the HFI showing its main parts and their
temperatures.



giving a high sensitivity with a short enough thermal time constant. The 
bolometers are thus the first critical item of the HFI. 

Cooling the detectors to 0.1 K in space is a major requirement that
drives the architecture of the HFI. Furthermore, the HFI will be launched
warm, which allows the detectors to receive the radiation from the telescope
without having to cross a window. This implies active coolers reaching 0.1 K,
which is achieved, starting from the passively cooled ~ 50 K stage of the
payload module, by a four-stage cooling system (18 K, 4 K, 1.6 K, 0.1 K).
The 18 K cooler is common to the HFI and the LFI.

The 4 K stage protects the inner stages from the thermal radiation
of the 18 K environment. It also provides electromagnetic shielding
(a Faraday cage) for the high impedance part of the readout electronics.
It is the envelope of the HFI focal plane unit. The coupling of the telescope
with the detectors is made by back-to-back horns attached on the 4 K stage, 
the aperture of the waveguides being the only radiative coupling between 
the inside and the outside of the 4 K box. Filters are attached on the 1.6 K 
stage, and bolometers on the 0.1 K stage, which corresponds to an optimal 
distribution of heat loads on the different stages. 

The HFI focal plane unit has an extension to the 18 K and 50 K stages, 
enclosing the first stage of the pre-amplifiers (J-FETs at 120 K). 

5.6.2 Sensitivity 

The ultimate limitation to the sensitivity of radiometers is the quantum 
fluctuations of the radiation itself, i.e. photon noise of the flux reaching the 
detector, ideally only that from the observed source. HFI is designed to 
approach this ideal limit. The signal to noise ratio obtained by one detector 
after an integration time t can be written, to a first approximation, as:

$$\frac{S}{N} = \frac{W_{\rm signal}}{{\rm NEP}\,(2t)^{-1/2} + W_{\rm systematics}}$$

where $W_{\rm signal}$ is the power absorbed by the detector, after transmission
by the optical system, NEP is the Noise Equivalent Power of the detection
system, including intrinsic detector noise, photon noise, and spurious signals,
and $W_{\rm systematics}$ is the power associated with the fraction of the systematic
effects, such as spin synchronous variations of straylight and temperature,
that cannot be taken away in the data reduction process (most of them are
removed using the redundancy in the data).
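
The behaviour of this expression is easy to explore numerically. The sketch below evaluates the signal to noise ratio as given above; the numerical values (a signal of 1e-18 W and an NEP of 1e-17 W/sqrt(Hz)) are hypothetical round numbers chosen for illustration, not instrument parameters.

```python
def signal_to_noise(w_signal, nep, t, w_systematics=0.0):
    """S/N = W_signal / (NEP * (2 t)**-0.5 + W_systematics).

    w_signal, w_systematics in W; nep in W/sqrt(Hz); t in seconds.
    """
    if t <= 0.0:
        raise ValueError("integration time must be positive")
    return w_signal / (nep * (2.0 * t) ** -0.5 + w_systematics)


W_SIGNAL = 1e-18   # W, hypothetical
NEP = 1e-17        # W / sqrt(Hz), hypothetical

for t in (50.0, 5000.0, 500000.0):
    clean = signal_to_noise(W_SIGNAL, NEP, t)
    floor = signal_to_noise(W_SIGNAL, NEP, t, w_systematics=1e-19)
    print(f"t = {t:8.0f} s: S/N = {clean:7.1f} (no systematics), {floor:5.2f} (with residual)")
```

Without systematics the ratio grows as the square root of the integration time. With a residual systematic term it saturates at $W_{\rm signal}/W_{\rm systematics}$, which is why the removal of systematic effects by redundancy matters as much as raw sensitivity.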

The photon noise on the HFI detectors originates mainly from the CMB 
for $\lambda > 1.5$ mm, and mainly from the thermal emission of the telescope at
shorter wavelengths. A colder telescope improves the sensitivity at high fre- 
quencies. At low frequencies, the HFI is designed to approach the quantum 
noise of the CMB itself. 

An instrument approaching the theoretical sensitivity limit must meet 
severe requirements in several domains: 

(i) the detectors' intrinsic noise must be small with respect to photon
noise. The current results obtained with spider-web bolometers give 
intrinsic NEPs equal to or less than photon noise; 

(ii) the efficiency of the optical system must be high; an overall efficiency 
better than 30% can be achieved; 

(iii) the stray light must have negligible impact on the measurement. The
horns are optimised to get the maximum directivity compatible with
the stray light requirement;

Table 5. HFI sensitivities.

Central frequency (ν)              GHz      100    143    217    353    545    857
Beam Full Width Half Maximum       arcmin   9.2    7.1    5.0    5.0    5.0    5.0
Number of unpolarised detectors             4      3      4      6      0      6
ΔT/T Sensitivity (Intensity)       μK/K     2.0    2.2    3.5    14.0   147    6670
Number of polarised detectors               0      9      8      0      8      0
ΔT/T Sensitivity (U and Q)         μK/K     -      4.2    8.1    -      140    -
Total Flux Sensitivity per pixel   mJy      10.2   12.6   9.4    19.4   38     43
y_SZ per FOV (× 10^6)                       1.11   1.88   547    6.44   26     600

(iv) the time response, the noise spectrum, and the detector layout must 
be consistent with the sky coverage strategy. While the instrument 
is scanning the sky, angular frequencies along the observed circle are 
detected as time frequencies. The system must be able to detect all the 
useful frequencies, from 0.016 Hz to nearly 100 Hz. This requires a new
type of readout electronics that has been developed for this purpose;

(v) in addition, other sources of noise, such as those induced by ionising 
particles or electromagnetic interference must be kept negligible. 
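
The frequency range in requirement (iv) follows from the scanning geometry. The sketch below assumes the observed circle is close to a great circle scanned at 1 RPM, so that a multipole $\ell$ (about $\ell$ cycles per 360 degrees) appears at a time frequency of $\ell$ times the spin frequency. The function names are illustrative.

```python
SPIN_RATE_RPM = 1.0  # scan rate quoted in the text


def spin_frequency_hz(rpm=SPIN_RATE_RPM):
    """Frequency of one full rotation of the satellite."""
    return rpm / 60.0


def multipole_to_hz(ell, rpm=SPIN_RATE_RPM):
    """Time frequency at which multipole ell appears along a great-circle scan."""
    return ell * spin_frequency_hz(rpm)


def beam_crossing_time_s(fwhm_arcmin, rpm=SPIN_RATE_RPM):
    """Time for the beam to sweep its own FWHM along a great-circle scan."""
    arcmin_per_second = 360.0 * 60.0 * rpm / 60.0
    return fwhm_arcmin / arcmin_per_second


print(f"spin frequency: {spin_frequency_hz():.4f} Hz")
print(f"ell = 3000 appears at {multipole_to_hz(3000):.0f} Hz")
print(f"5 arcmin beam crossing time: {beam_crossing_time_s(5.0) * 1e3:.1f} ms")
```

The lowest useful frequency is the spin frequency itself, about 0.016 Hz, and $\ell = 3000$ maps to about 50 Hz. The upper figure of nearly 100 Hz quoted above leaves margin beyond this for sampling the beam.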

Once this sensitivity limit has been reached, the only ways to increase the 
accuracy of the measurement are to increase the number of detectors and/or 
the duration of the mission. 

Table 5 gives the mean sensitivity per pixel for a mission duration of 
14 months. Intrinsic bolometer noise is assumed to be equal to photon 
noise. Pixels are assumed to be square (side = beam Full Width at Half
Maximum). The $\Delta T/T$ sensitivity is the noise expressed as a relative
change in CMB temperature ($1\sigma$). $y_{\rm SZ}$ is the sensitivity ($1\sigma$) to the
comptonisation factor for the Sunyaev-Zeldovich effect.

The beam patterns on the sky are nearly Gaussian and well defined
by their full width at half maximum. For all channels, the spectral resolution
is $\nu/\Delta\nu = 3.34$, and the total optical efficiency, including the losses in
the telescope, is assumed to be 0.32. The sensitivity given in Table 5 is relevant
for a uniform extended source (or a point source for the flux sensitivity)
varying with time scales long enough to avoid damping by the bolometer
time constant and short enough not to be in the domain of 1/f noise.
Deviations from this ideal case result from the following effects:

• Spatial frequencies are not transmitted uniformly by the optical system.
The Modulation Transfer Function filters high spatial frequencies. The
goal is to optimise angular resolution keeping the straylight at an
acceptable level;

• High frequencies are also filtered out by the bolometer time response.
Implementing fast enough bolometers is a requirement for such an
instrument;

• The sensitivity at low frequencies may be degraded due to 1/f noise of
the detection system of the electronics. In consequence, the temperature
stability of the 0.1 K stage, and the readout electronics are required to
show no excess fluctuations down to 0.016 Hz, i.e. the frequency of the
1 RPM scanning.

5.6.3 Focal plane optics 

Architecture To detect anisotropies at a level of 1 part in $10^{6}$ in the
CMB it is essential that the sensitivity of the HFI to unwanted energy is 
minimised. A high rejection ratio must be achieved in the angular domain, 
i.e. for scattered and diffracted waves. For that purpose, the field of view 
of the HFI detectors is determined by naked horns in the focal plane which 
have a well determined angular response thereby allowing control of the 
stray fields whilst coupling efficiently to the wanted energy. Since most 
sources of spurious radiation have a steep spectrum, a high spectral rejec- 
tion is also mandatory. In addition, the radiation of the filters themselves 
on the detectors, and the load from warm parts on the cryogenic stages 
must be kept small (a few nW on the 0.1 K stage). Complying with these 
requirements is achieved by a wide use of high performance interference fil- 
ters, and by an original architecture tightly coupling optical and cryogenic 
designs (Fig. 30). 

By using back-to-back horns at 4 K a beam waist is produced at the 
1.6 K level where spectral filters are placed to define the detected band. A 
third horn re-images radiation onto the bolometric detector. This design 
naturally offers thermal breaks between the 100 mK detectors, the warmer 
1.6 K filters and the focal plane horns at 4 K. 

Figure 30b shows a schematic of the proposed HFI focal plane that 
optimises the use of the available focal plane space. The limitation of the 
number of horns comes from a thermal and mass limitation imposed by the 
cooler and from the requirement to share the focal plane with the LFI. 

Coupling to the telescope - horn requirements The philosophy be- 
hind the scheme for the focal plane horns is influenced by a number of 
specific requirements peculiar to the Planck Mission. 

To obtain the necessary resolution with low spillover properties, a conical 
corrugated horn design has been chosen for the feeds. The 100, 143, 217, 
and 353 GHz horns are single moded, and so produce coherent diffraction 
limited beam patterns. 

The specific horn design is a compromise driven by the straylight requirements,
which are quite exacting, and the goal of optimal angular resolution

Fig. 31. Schematic of optical layout for a single HFI pixel with, at 0.1 K (left),
the bolometer, its horn, and its filters, at 1.6 K (centre) filters, and at 4 K (right)
filters and back-to-back horns.



on the CMB given the finite size of the telescope. To reduce the stray light 
contamination by the Galaxy, the total integrated spillover power at the 
primary has to be minimised. Conservative values for acceptable spillover 
levels for the different frequency channels vary between 0.6% at 100 GHz to 
0.2% in the higher frequency bands. 

A drawing of a prototype 100 GHz horn assembly used for testing the
proposed instrument performance is shown in Figure 31. It is possible to
improve the design of the horns by shaping the generator of the cone into a
curve more complex than a straight line. With the appropriate shaping, one can
improve both the spillover and the illumination of the main mirror, thereby
improving the angular resolution on the sky while decreasing stray light. The
edge taper and spillover levels obtained with such horns are listed in Table 6.

The feed horns are part of a back-to-back dual horn structure connected 
via a waveguide which controls the number of modes that can propagate. 
For the 100, 143, 217 and 353 GHz channels the waveguide allows for

Table 6. Performance of the shaped and flared horns used in the Planck HFI
design.

ν (GHz)   Spillover (%)   Edge taper (dB)   FWHM (arcmin)
100       0.6             -25               9.2
143       0.4             -28               7.1
217       0.2             -35               5.0
353       0.1             -35               5.0
545       0.1             -30               5.0
857       0.1             -30               5.0




Fig. 32. Far field beam pattern for HFI 143 GHz feed. 



propagation of the fundamental mode only (one or both polarisations). A 
design suitable for these channels, in which there is a transition from a 
corrugated to smooth wall within the flared section of the horns, and which 
has very low return loss, is shown in Figure 31. The corresponding far-field
beam pattern is shown in Figure 32. 

For the higher frequencies (545 GHz and 857 GHz) the angular
resolution requirement does not demand diffraction limited operation for the
required spillover levels. Few-moded horns will therefore be utilised in these
cases to increase the throughput and coupling to a wider beam on the sky.
Few-moded operation is obtained by increasing the waveguide diameter and
allowing higher-order waveguide modes to propagate. Because of the wide
bandwidth, the number of modes and, thus, the narrow-band beam pattern
will vary across the full 25% bandwidths of the detectors; the integrated
pattern, therefore, will be carefully modelled and measured to ensure a
well-understood design.



Wavelength selection, filters The Planck submillimetre telescope is 
a simple off-axis Gregorian with a primary aperture of ~ 1.5 metres. This 
design ensures that there are no support structures in the beam, which could 
otherwise cause diffraction of the sky beam or radiate unwanted power to 
the detectors. The telescope will be a low emissivity one (expected to be 
< 0.5%), passively cooled, together with its enclosure, to about 50 K to 
minimise the thermal power radiated to the HFI detectors. Optically the 
telescope system is equivalent to a single parabolic mirror with an effective 
focal length of 1.8 metres, which focuses the sky radiation onto bolometric 
detectors located inside the HFI module. The rejection of the broadband 
emission from the sky and telescope requires a sequence of filters to guar- 
antee the spectral purity of the final measurements. Currently the spectral 
bands are defined by a combination of the high pass waveguide cut-on be- 
tween the front back-to-back horns and a low pass metal mesh filter cut-off. 
Because of the requirement to minimise harmonic leaks there are four
additional low pass edge filters such that the overall rejection exceeds $10^{10}$
at higher frequencies. The measured spectral performance for a prototype
143 GHz band filter set is given in Figure 33. The characteristic is shown 
for each filter along with the overall system transmission. As can be seen, 
the overall filter transmission is about 55% while the rejection increases 
from 10^° to 10^^ not accounting for the cut-off of the final filter. These 
characteristics fulfil the spectral purity requirement for HFI channels. 

5.6.4 Bolometric detectors 

The HFI sensitivity requirements have been determined from the funda- 
mental constraints of photon noise originating from the 3 K CMB radiation 
itself at the longer wavelengths and the residual emission from the telescope 
and instrument at the shorter wavelengths. Specifically then, the Planck 
HFI bolometric detectors require inherent NEPs of less than or equal to 
the quadratic sum of the noises from the background components together 
with a speed of response fast enough to preserve all of the signal information
at the 1 RPM scan rate of the satellite. Detectors that provide the
required sensitivity and response speed are the Caltech/JPL spider bolometers.
Table 7 summarises the required bolometer parameters along with

Fig. 33. Plot of Prototype 143 GHz Channel Spectral Response (horizontal axis
in GHz).



the results of measurements on a prototype device, CSK18, whose time 
constant and NEP are close to those needed for the 100 GHz channel. 

As is evidenced in Table 7, the silicon nitride micromesh (“spider web”)
bolometer technology developed for Planck will provide background 
limited performance in all bands. The radiation is efficiently absorbed in a 
conducting film deposited on a micromesh absorber, which is thermally iso- 
lated by radial legs of uncoated silicon nitride that provide rigid mechanical 
support with excellent thermal isolation (see Fig. 34 for details). 

The temperature of the absorber is measured by a small neutron 
transmutation doped (NTD) Ge thermistor that is indium bump-bonded 
to the absorber and readout via thin film leads that are photo-lithographed 
on two of the radial legs. Compared to a solid absorber, the micromesh
has a geometric filling factor of approximately 1.5%, providing a corre- 
spondingly small suspended mass, absorber heat capacity, and cosmic-ray 
cross-section. The lithographic techniques used to fabricate the detectors

Table 7. Requirements on detectors with assumptions: (1) the satellite has a
1 RPM scan rate and we take 2 time constants ($2\tau$) per beam; (2) the inherent
detector noise is ~ 10 nV/√Hz and the amplifier noise ~ 5 nV/√Hz;
(3) $NEP_{\rm bol} = (NEP_{\rm phonon}^2 + NEP_{\rm Johnson}^2 + NEP_{\rm amp}^2)^{1/2}$;
(4) $NEP_{\rm tot} = (NEP_{\rm bol}^2 + NEP_{\rm photon}^2)^{1/2}$;
(5) the bolometer thermal conductance, G = MAX(100*Q, 0.56*C/τ) pW/K.

Frequ.   τ       Q       C(est.)   G       NEP_bol    NEP_phot   NEP_t/NEP_p
GHz      msec    pW      pJ/K      pW/K    ×10^-17    ×10^-17

100      4.6     0.43    0.46      56      0.82       1.01       1.29
143      3.2     0.46    0.39      68      0.90       1.24       1.24
217      2.1     0.47    0.34      90      1.04       1.49       1.22
353      2.0     1.12    0.36      110     1.16       2.88       1.08
857      2.0     12.0    0.58      1200    3.80       14.6       1.03

143p     3.2     0.23    0.39      68      0.90       0.88       1.43
217p     2.1     0.24    0.34      90      1.04       1.05       1.41
545p     2.0     1.87    0.41      190     1.51       4.66       1.05

Measured performances
CSK18    4.5     -       0.45      56      0.82       -          -

ensure high reliability and reproducibility. Micromesh bolometers are
currently being used in numerous CMB experiments (BOOMERanG,
MAXIMA, SuZIE, PRONAOS) which operate under similar optical loading
and detector sensitivity requirements to those needed here.

For all of the Planck bands, the optimum thermal conductance G between
the absorber and the heat sink, and thus the NEP ~ $(4kT^2G)^{1/2}$, is
determined by the time constant requirement (conservatively taken to be
less than half the beam crossing time) and the heat capacity. The micromesh
architecture allows the thermal conductance to be easily tailored to the
optimum value for each band. Sensitivity is thus limited ultimately by the
heat capacity of the device. In practice the polarised channels at 143 and
217 GHz, where the backgrounds are lowest, provide the most demanding
requirement but the data from the CSK18 prototype show that even these
performances can be met.
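
The scaling NEP ~ $(4kT^2G)^{1/2}$ can be evaluated directly. The sketch below takes T equal to the 0.1 K stage temperature, which is a simplification (a loaded absorber runs somewhat warmer), and uses conductance values read from Table 7. The function name is illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J / K


def phonon_nep(temperature_k, conductance_w_per_k):
    """Thermal-fluctuation (phonon) NEP in W/sqrt(Hz): sqrt(4 k T^2 G)."""
    return math.sqrt(4.0 * K_BOLTZMANN * temperature_k ** 2 * conductance_w_per_k)


T_STAGE = 0.1  # K, bolometer stage temperature quoted in the text

for g_pw_per_k in (56.0, 90.0, 1200.0):  # G values taken from Table 7
    nep = phonon_nep(T_STAGE, g_pw_per_k * 1e-12)
    print(f"G = {g_pw_per_k:6.0f} pW/K -> phonon NEP = {nep:.2e} W/sqrt(Hz)")
```

For G = 56 pW/K this gives about $0.56 \times 10^{-17}$ W/√Hz, below the $NEP_{\rm bol}$ of $0.82 \times 10^{-17}$ listed for the 100 GHz channel, as expected since the tabulated value also includes Johnson and amplifier noise. Because the phonon NEP grows as the square root of G, keeping G as low as the time constant requirement allows is what drives the need for a small heat capacity.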

The estimated bolometer NEP is compared with the background-limited
NEP in Table 7. For most channels, the bolometer NEP is significantly 
below the background limit. In the worst case, for the 143 and 217 GHz 
polarised channels, the detector and background-limited NEPs are equal. 
The last column in Table 7 compares expected performances to an ideal 
detector system.

Fig. 34. Prototype spider bolometer CSK18. Active absorber diameter, outer
spider circle, is 5.675 mm. Inset shows NTD Ge sensor at the centre with the
two thicker current carrying and thermal conductance control lines running out
horizontally to electrical contacts on the silicon substrate.



Because the resonant frequency of the micromesh bolometers is high 
(~ 50 kHz), the devices are insensitive to the relatively low frequency vi- 
brations encountered during the launch and operation. 

6 Extraction of systematic effects and map making 

6.1 Maximum likelihood estimators 

We now turn to the problems of analysing the collected data to set con- 
straints on parameters of a theoretical model of the data. 

Let D stand for some form of the data, e.g. a data vector which we
picture as a collection of random variables$^{9}$, and let T stand for some
vector of theoretical parameters which we seek to constrain. We need some
probability distribution or likelihood function, $L(D|T)$, which tells us how
likely a particular experimental outcome is, given a set of parameters of the
theoretical model.



$^{9}$The randomness stems from the unavoidable presence of noise in the experiment.

In the following, D might be the data flow from a detector and T the
pixel values of a discretised sky map; or D might represent a collection of
maps of the sky at different frequencies, and T might then be the pixel values
of templates of the emission of various astrophysical processes; or else D
might be a CMB anisotropy map, and T might then stand for its power spectrum,
or for the parameters of a model of the anisotropies (e.g. a dozen inflation
and cosmological parameters).

We shall denote by $\hat{T}$ an estimator of the parameters whose true value
is $T$. Of course we wish the estimator to be unbiased on average,

$\langle \hat{T} \rangle = T$,    (6.1)

and to be as precise as possible, i.e. to have minimal standard deviations
on each parameter $T_i$,

$\sigma_{T_i} \equiv \left( \langle \hat{T}_i^2 \rangle - \langle \hat{T}_i \rangle^2 \right)^{1/2}$.    (6.2)

Such an estimator would then be the best unbiased estimator available. 

As summarised in [159], statisticians [56,97-99] have long since
demonstrated a number of results explaining why maximum likelihood estimators
are so useful:

• if there is a best unbiased estimator, it is the one that maximises the 
likelihood L (or a function of it); 

• the maximum likelihood estimator is the best unbiased estimator in 
the limit of large datasets; 

• and most importantly, any unbiased estimator will obey the
Cramér-Rao inequality

$\sigma_{T_i} \geq 1/(F_{ii})^{1/2}$,    (6.3)

where $F_{ii}$ are diagonal elements of the Fisher information matrix [56]

$F_{ij} \equiv \left\langle \frac{\partial^2 \mathcal{L}}{\partial T_i \, \partial T_j} \right\rangle$, with $\mathcal{L} \equiv -\ln L$.    (6.4)

In addition, if the other parameters $T_j$ are also determined from the
data (i.e. they are not “marginalised over”), the minimal standard
deviation will increase to

$\sigma_{T_i} \geq \left[ (F^{-1})_{ii} \right]^{1/2}$.    (6.5)

The best (maximum likelihood) estimator is then the one for which
the equality in (6.3) or (6.5) is realized.
The Fisher matrix is thus very useful in quantifying how well a particular
experiment may constrain some theoretical parameters and has been widely 
used in astronomy in general and to forecast the constraints that will be 
posed by future experiments like MAP and Planck. It also leads to the 
concept of lossless compression of the data [159], meaning any analysis which 
will reduce the size of the dataset while leaving the Fisher matrix invariant 
(and thus the precision which one may hope to achieve). 
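
As an illustration of such a forecast, the sketch below evaluates the bounds (6.3) and (6.5) for a toy linear model, y = A T + b, with Gaussian noise b of covariance B. It assumes the conventional factor 1/2 in the exponent of the Gaussian likelihood, for which $F = A^T B^{-1} A$; the straight-line model, the noise level and all names are hypothetical choices made for the example.

```python
import numpy as np

def fisher_matrix(A, B):
    """Fisher matrix F = A^T B^{-1} A of the linear model y = A T + b,
    assuming Gaussian noise b with covariance B."""
    return A.T @ np.linalg.solve(B, A)

# Hypothetical toy model: a straight line, T = (offset, slope), sampled at 20 points.
t = np.linspace(0.0, 1.0, 20)
A = np.column_stack([np.ones_like(t), t])
B = 0.1**2 * np.eye(t.size)          # white noise with standard deviation 0.1

F = fisher_matrix(A, B)
sigma_fixed = 1.0 / np.sqrt(np.diag(F))            # bound (6.3): other parameter known
sigma_joint = np.sqrt(np.diag(np.linalg.inv(F)))   # bound (6.5): both parameters fitted
```

Because the offset and the slope are correlated, `sigma_joint` exceeds `sigma_fixed` for both parameters, which is the increase described by (6.5).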

6.2 Using noise properties 

Applying Bayes' theorem to CMB data analysis is quite enlightening,
and we shall follow that presentation. The theorem states

$P(T|D)\,P(D) = L(D|T)\,P(T)$.    (6.6)

If, as above, D stands for some form of the data, and T for some “theory”
which we seek to constrain, then the estimator of the theory, $\hat{T}$, is the
one that maximises the posterior probability $P(T|D)$ of the theory given the
data. By Bayes' theorem, this amounts to maximising the product of
the likelihood $L(D|T)$ of the data given the theory and the prior $P(T)$ (since
the evidence $P(D)$ is usually a mere normalisation constant).

In order to be more specific, suppose the data are arranged as a vector
of noisy measurements, y, which has two components, a signal, s, and some
noise, b, so that

$D \equiv y = s + b$.    (6.7)

The likelihood $L(D|T)$ can only be written down if the statistical properties
of the noise are known.

We shall assume that the noise is Gaussian distributed, and thus obeys
a multi-variate normal distribution

$P(b) \propto \exp\left( -b^T B^{-1} b \right)$,    (6.8)

where $B = \langle b b^T \rangle$ stands for the covariance matrix of the noise
(which we further assume to be uncorrelated with the signal). This enables
writing $L(D|T)$ via the difference between the data and the signal.

It is often the case in this context that we can assume a linear relation
between the signal and the theory, which we also arrange as a vector, x,

$y = A x + b$,    (6.9)

where A is the matrix describing how the observation of the “theory” is
conducted (we shall describe a more complicated case below in Sect. 7.4).
We thus have

$L(D|T) = P(b) \propto \exp\left[ -(y - Ax)^T B^{-1} (y - Ax) \right] \equiv \exp\left[ -\chi^2(x) \right]$.    (6.10)

Different solutions will follow depending on the assumed prior P(x).
6.2.1 Systematics

Before we proceed, we stress that the theoretical description should be
complete, i.e. $y - Ax$ should really be noise! If there are any systematics,
like an unknown source of signal u, one needs to incorporate it in the model,
e.g. write $b = y - Ax - u$, with this model of the systematics potentially
depending on unknown parameters which can also be obtained by maximising
$P(x, u|y)$. Otherwise, we shall force the systematics into the estimated
theory, so that the difference obeys Gaussian statistics; in other words,
“Garbage In” implies “Garbage Out”...

To identify and separate the systematic effects u from the sky signal, we 
need to have a measurement (sky scanning) strategy which offers enough 
redundancy. This is discussed in detail in Section 6.4. 
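
The effect of an incomplete model can be seen on a toy case. The sketch below is only an illustration under simple assumptions: white noise, a pointing matrix with a single 1 per row, and a systematic signal u of known shape (a hypothetical linear drift) but unknown amplitude. Ignoring u forces it into the estimated map, whereas fitting its amplitude together with the map removes the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n_t, n_p = 400, 4
pix = rng.integers(0, n_p, n_t)              # hypothetical pointing: pixel seen at each sample
A = np.zeros((n_t, n_p))
A[np.arange(n_t), pix] = 1.0

x_true = np.array([1.0, -2.0, 0.5, 3.0])
template = np.linspace(0.0, 1.0, n_t)        # assumed known shape of the systematic (a drift)
amp_true = 5.0
y = A @ x_true + amp_true * template + 0.1 * rng.standard_normal(n_t)

def least_squares(M, data):
    """White-noise maximum likelihood solution of data = M p + noise."""
    return np.linalg.lstsq(M, data, rcond=None)[0]

x_naive = least_squares(A, y)                              # systematic ignored
params = least_squares(np.column_stack([A, template]), y)  # systematic amplitude fitted too
x_joint, amp_hat = params[:n_p], params[n_p]
```

With enough redundancy (each pixel is seen at many different times) the drift amplitude and the map are both well constrained, and `x_joint` is much closer to `x_true` than `x_naive`.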

6.2.2 Priors 

• If we have no theoretical prior, then $P(T) = \mathrm{const.}$; the Maximum
Likelihood is then obtained by $\chi^2$ minimisation. The solution is then
found to be

$\hat{x} = W_c\, y$, with $W_c = \left[ A^T B^{-1} A \right]^{-1} A^T B^{-1}$,    (6.11)

where $\hat{x}$ denotes the estimator of the theory x.

• If we can assume a Gaussian “theory”, then

$P(T) \propto \exp\left( -x^T C^{-1} x \right)$,    (6.12)

where $C = \langle x x^T \rangle$ is the covariance matrix of that theory. The
posterior probability to maximise can then be written as

$P(x|y) \propto \exp\left[ -\chi^2(x) - x^T C^{-1} x \right] \propto \exp\left[ -e^T E^{-1} e \right]$,    (6.13)

where

$e = x - \hat{x}$    (6.14)

stands for the vector of reconstruction errors and $E = \langle e e^T \rangle$
for the associated covariance matrix. The maximum likelihood solution is
then a Wiener filtering of the data, $\hat{x} = W_W\, y$, with

$W_W = \left[ A^T B^{-1} A + C^{-1} \right]^{-1} A^T B^{-1}$,    (6.15)

which differs from the above (Eq. (6.11)) only by the additional $C^{-1}$
term in the “denominator”. Note that an exactly equivalent form
mathematically (e.g. [153]) is

$W_{W'} = C A^T \left[ A C A^T + B \right]^{-1}$,    (6.16)

where neither $B^{-1}$ nor $C^{-1}$ appears.
• Of course, other priors will lead to other solutions. In particular,
Hobson et al. [88] use an entropic prior$^{10}$, based on
information-theoretic considerations alone, which has had considerable success
in radio-astronomy and many other fields involving complex image
processing. Given an image, x, and a model, m,

$P(x) \propto \exp\left[ \alpha S(x, m) \right]$,    (6.17)

where $\alpha$ is a regularising parameter, and $S(x, m)$ stands for the
“cross-entropy” of x and m. The classical difficulty is that x should be
a positive additive distribution, although this can be circumvented
by considering the image x as the difference of two positive additive
distributions, $x = u - v$. In that case

$S(x, m_u, m_v) = \sum_{j \in \mathrm{pixels}} \left\{ \psi_j - m_{uj} - m_{vj} - x_j \ln\left[ \frac{\psi_j + x_j}{2 m_{uj}} \right] \right\}$,    (6.18)

where $\psi_j = \left[ x_j^2 + 4 m_{uj} m_{vj} \right]^{1/2}$, and $m_u$ and $m_v$ are
separate models for u and v whose properties can be set from an assumed C.
The Maximum Entropy Method (hereafter MEM) therefore reduces
to minimising versus x the non-linear function

$\Phi_{MEM}(x) = \chi^2(x) - \alpha S(x, m_u, m_v)$.    (6.19)

The value of the trade-off parameter, $\alpha$, can itself be obtained
self-consistently by treating it as another parameter of the hypothesis
space. Finally, the whole procedure above (i.e. $\Phi_{MEM}$ minimisation
including the search for $\alpha$) can in turn be looped over by using the
current iteration to provide updated estimates of the correlation matrix
C, and of the models $m_u$ and $m_v$...
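
The linear estimators above are easy to check numerically. The following sketch builds the no-prior weights of (6.11) and the two forms (6.15) and (6.16) of the Wiener weights for small random matrices; the sizes and the random positive-definite covariances are hypothetical and only serve the check.

```python
import numpy as np

rng = np.random.default_rng(1)
n_t, n_p = 30, 5                       # hypothetical sizes: 30 data points, 5 parameters
A = rng.standard_normal((n_t, n_p))

def random_spd(n):
    """Random symmetric positive-definite matrix, used here as a toy covariance."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

B, C = random_spd(n_t), random_spd(n_p)
Binv, Cinv = np.linalg.inv(B), np.linalg.inv(C)

W_c = np.linalg.solve(A.T @ Binv @ A, A.T @ Binv)            # no prior, eq. (6.11)
W_w = np.linalg.solve(A.T @ Binv @ A + Cinv, A.T @ Binv)     # Wiener, eq. (6.15)
W_w_alt = C @ A.T @ np.linalg.inv(A @ C @ A.T + B)           # Wiener, eq. (6.16)
```

One finds that `W_c @ A` is the identity (the no-prior estimate is unbiased), while `W_w` and `W_w_alt` agree to machine precision although the second form never inverts B or C separately.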



6.3 Map making 

We now focus on the map-making problem, i.e. on how to convert a
time-ordered signal, or TOD, of length $N_t$ from a detector to a map of the sky.
In this case, y is the series of $N_t$ values measured sequentially in time (the
TOD) by a detector and x is a “channel” map of $N_p$ pixel values arranged
as a vector according to some numbering scheme.

We assume that the TOD has already been analysed in depth and
cleaned, with all glitches (e.g. cosmic rays hitting the detector) removed,
bad data recognised, excised and treated appropriately (e.g. replaced by
a constrained noise realisation), and the TOD noise covariance matrix, B,
well estimated, as well as the beam and the pointing solution (i.e. we know
exactly in which direction of the sky a particular detector was pointing at
a given time). In brief, a lot of work has already been done...

$^{10}$The paper deals with the problem of the separation of components which we shall
address below, but the formalism is exactly the same.

We are then in a position to write down the “Observation matrix”, A,
which describes the process of “reading a map” x. The measurement system
has a certain optical response, i.e. we can express the signal as the
convolution of the optical beam with the sky (times some overall normalisation
by a transmission efficiency which can be absorbed as a calibration factor).

Here we shall restrict ourselves to the simple case when the beam is
isotropic$^{11}$. We can thus think of the measurement process as reading at
every pointing a signal which is the sky convolved by the beam. In this case
we simply have:

• A is an $N_t \times N_p$ “Observation operator”, which contains in each row
a single 1 (corresponding to the pixel being pointed at) with all the
other elements being zero;

• $A^T$ is an $N_p \times N_t$ “Projection” operator of the TOD on the map, with
a single 1 per column, i.e. it sums all the measurements and attributes
the result to the pixel toward which the detector was pointing;

• $A^T A$ is an $N_p \times N_p$ (diagonal) matrix whose elements are hit counters
for each map pixel.

We aim at recovering a map convolved by the (isotropic) beam of the
instrument.
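
The three operators listed above can be made concrete on a tiny example. The sketch below builds A as a dense array for a hypothetical six-sample TOD scanning three pixels (a real pipeline would never store A explicitly; the pointing list `pix` is all that is needed).

```python
import numpy as np

def pointing_matrix(pix, n_pix):
    """Dense N_t x N_p observation matrix with a single 1 per row."""
    A = np.zeros((pix.size, n_pix))
    A[np.arange(pix.size), pix] = 1.0
    return A

pix = np.array([0, 2, 2, 1, 0, 2])               # hypothetical pixel seen at each time sample
y = np.array([1.0, 4.0, 6.0, 3.0, 2.0, 5.0])     # hypothetical TOD values
A = pointing_matrix(pix, 3)

summed = A.T @ y      # projection: sum of the samples falling in each pixel
hits = A.T @ A        # diagonal matrix of hit counts
```

Here `summed` is (3, 3, 15) and the diagonal of `hits` is (2, 1, 3), i.e. the same numbers as `np.bincount(pix)`.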

6.3.1 “COBE” map making 

Let us start with the no-prior solution which was actually used by the
COBE team,

$W_c = [A^T B^{-1} A]^{-1} A^T B^{-1}$,    (6.20)

which indeed leads to an unbiased estimate of the underlying sky since

$\hat{x} = W_c (A x + b) = x + n$, with $n = [A^T B^{-1} A]^{-1} A^T B^{-1} b$.    (6.21)



$^{11}$We also need to have a final (map) signal-to-noise ratio of order one or less per pixel
of size ~ the FWHM of the beam. In that case we can indeed use pixels of about that
size, since the inaccuracies implied by the incompleteness of the basis or aliasing noise
will be small compared to the detector noise. If this is not the case, one must use
smaller pixels and describe the beam convolution within A, as for asymmetrical beams...
Note that the noise in the recovered map, n, is independent of the unknown
x, and its covariance matrix, $N = \langle n n^T \rangle$, is simply

$N = [A^T B^{-1} A]^{-1}$.    (6.22)

In the case of stationary white noise, each measurement is independent of the
others, with the same noise variance, $\sigma^2$, and the noise covariance
matrix B is then simply $\sigma^2$ times the identity. In this simplistic case, we
see that this scheme reduces to

$W_{WN} = [A^T A]^{-1} A^T$,    (6.23)

i.e. one averages equally all the measurements of the same pixel (by simply
adding up all the measurements of a pixel and dividing by the total number of
hits). We further see that $N_{ij} = \sigma^2 \delta_{ij} /$ (number of hits of
pixel i), i.e. the noise in each pixel is uncorrelated with the others, with a
variance inversely proportional to the number of hits.
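
For white noise the whole map-making therefore reduces to a binned average. The sketch below is a minimal illustration under hypothetical assumptions (a random scan hitting every pixel equally often, a Gaussian random sky); it also cross-checks the binned map against the explicit matrix form of (6.23).

```python
import numpy as np

def bin_map(pix, y, n_pix, sigma):
    """White-noise map-making, eq. (6.23): per-pixel average of the TOD and
    per-pixel noise variance sigma^2 / hits. Assumes every pixel is hit."""
    hits = np.bincount(pix, minlength=n_pix)
    summed = np.bincount(pix, weights=y, minlength=n_pix)
    return summed / hits, sigma**2 / hits

rng = np.random.default_rng(2)
n_pix, n_t, sigma = 8, 400, 0.5
pix = rng.permutation(np.arange(n_t) % n_pix)   # hypothetical scan: each pixel hit 50 times
x_true = rng.standard_normal(n_pix)
y = x_true[pix] + sigma * rng.standard_normal(n_t)

x_hat, var = bin_map(pix, y, n_pix, sigma)

# Cross-check against the explicit matrix form [A^T A]^{-1} A^T y.
A = np.zeros((n_t, n_pix))
A[np.arange(n_t), pix] = 1.0
x_matrix = np.linalg.solve(A.T @ A, A.T @ y)
```

The two estimates coincide, and `var` is the diagonal of the map noise covariance N of (6.22) for $B = \sigma^2 I$.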

If the noise is still stationary, but not white, the trick (see Wright [169])
is to apply a “pre-whitening” filter P to the TOD, $\tilde{y} = P y$, such that
$\tilde{b} = P b$ has a white noise spectrum, $\langle \tilde{b} \tilde{b}^T \rangle = P B P^T = I$
(I standing for the identity matrix). The estimated map is then obtained by
averaging the (pre-whitened) pixel measurements according to

$\hat{x} = [(PA)^T (PA)]^{-1} (PA)^T \tilde{y} = [A^T P^T P A]^{-1} A^T P^T P\, y$,    (6.24)

which is the same as equation (6.20), with $P^T P = B^{-1}$. Further
implementation details and a slightly more general presentation can be found in
Tegmark [152].

To be more concrete, suppose that the noise (Fourier) spectrum may
be described as a white noise plateau with a 1/f upturn toward the low
frequencies below some threshold $f_{knee}$,

$\langle |b(f)|^2 \rangle = 2 \Delta t\, \sigma^2 \left( 1 + \frac{f_{knee}}{f} \right)$,    (6.25)

where $\Delta t$ is the sampling interval and $\sigma^2$ is the variance of the
noise in this time interval, ignoring the 1/f term. Then the Fourier transform
of the filter, P(f), should simply be, up to the normalisation by the white
noise level,

$P(f) = \left( 1 + \frac{f_{knee}}{f} \right)^{-1/2}$.

In the time domain, this filter has a spike at $\delta t = 0$ (since $P(f) \to 1$
when $f \gg f_{knee}$), other values being negative to ensure a zero-sum filter
(since $P(f = 0) = 0$). In effect, one removes an optimal baseline by a
high-pass filter.
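
A minimal sketch of such a pre-whitening filter is given below. It assumes that the condition $P B P^T = I$ applied to the spectrum (6.25) gives $P(f) = (1 + f_{knee}/f)^{-1/2}$ once the filter is normalised to unity on the white plateau, and it treats the TOD as periodic so that the filter can be applied with an FFT; the function names and numerical values are hypothetical.

```python
import numpy as np

def prewhitening_filter(n, dt, f_knee):
    """Frequencies and filter P(f) = (1 + f_knee/f)^(-1/2), with P(0) = 0."""
    f = np.fft.rfftfreq(n, d=dt)
    P = np.zeros_like(f)
    P[1:] = 1.0 / np.sqrt(1.0 + f_knee / f[1:])
    return f, P

def prewhiten(y, dt, f_knee):
    """Apply the filter in Fourier space (the TOD is treated as periodic)."""
    _, P = prewhitening_filter(y.size, dt, f_knee)
    return np.fft.irfft(P * np.fft.rfft(y), n=y.size)

n, dt, f_knee = 1024, 0.01, 0.1          # hypothetical sampling and knee frequency
f, P = prewhitening_filter(n, dt, f_knee)
kernel = np.fft.irfft(P, n=n)            # time-domain filter: spike at lag 0, zero sum

rng = np.random.default_rng(6)
y = 3.0 + rng.standard_normal(n)         # white noise on top of a constant baseline
y_white = prewhiten(y, dt, f_knee)       # the baseline is removed
```

The kernel sums to zero because P(0) = 0, so any constant baseline in the TOD is removed by the filtering.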
Fig. 35. (a) A map of the CMB anisotropies with 10 arcmin resolution from
MAXIMA-1. The map is made using three 150 GHz and one 240 GHz
photometers. It contains 15 000 5' x 5' pixels, with an average signal-to-noise
ratio larger than 2. (b) A Wiener-filtered version of the map shown in (a),
using its self-determined power spectrum to design the filter. Reprinted from [77].



6.3.2 Signal-to-noise (Wiener) filtering 

By using the result of equation (6.22), we can rewrite the no-prior scheme
as $W_c = N A^T B^{-1}$. In addition, we saw above that assuming a Gaussian
theory leads to a map-making scheme in which there is an additional $C^{-1}$
term in the pixel weighting of the TOD projected onto the map, i.e.

$\hat{x}_W = W_W\, y = [N^{-1} + C^{-1}]^{-1} N^{-1}\, \hat{x}$   (with $\hat{x} = W_c\, y$)

$= C [C + N]^{-1}\, \hat{x}$.

This shows that Wiener filtering amounts to filtering the no-prior map by
$C[C+N]^{-1}$, which is clearly a signal-to-noise weighting of the no-prior
map. This was actually performed on the (real) COBE map by Bunn
et al. [33,34]. Figure 35 gives an example of such Wiener filtering on the
wonderful map derived by MAXIMA-1.

It is also interesting to note that $S = A C A^T$ is the signal covariance
matrix of the TOD ($S = \langle A x (A x)^T \rangle$). The alternative form of Wiener
filtering shown in equation (6.16) can then be cast in the form

$W_{W'}\, y = \{ [A^T A]^{-1} A^T A \}\, C A^T [A C A^T + B]^{-1} y$

$= [A^T A]^{-1} A^T\, S [S + B]^{-1} y$.

We see that the same result may thus be obtained by first applying a
signal-to-noise weighting on the timeline ($S[S+B]^{-1} y$), and then making the
map as if the noise were white. Although formally equivalent, these two
approaches are not algorithmically equivalent$^{12}$.
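
The meaning of the signal-to-noise weighting is most transparent when both C and N are diagonal, in which case each pixel of the no-prior map is simply multiplied by C/(C + N). The sketch below illustrates this with hypothetical numbers (unit signal variance, white noise of unit variance per sample, very unequal hit counts).

```python
import numpy as np

def wiener_filter_map(x_noprior, C, N):
    """Apply the map-space Wiener weighting C [C + N]^{-1} to a no-prior map."""
    return C @ np.linalg.solve(C + N, x_noprior)

# Hypothetical case: four uncorrelated pixels of unit signal variance,
# observed 1, 4, 16 and 64 times with unit-variance white noise per sample.
hits = np.array([1, 4, 16, 64])
C = np.eye(4)
N = np.diag(1.0 / hits)

x_noprior = np.full(4, 2.0)
x_wiener = wiener_filter_map(x_noprior, C, N)
weights = x_wiener / x_noprior           # equals 1 / (1 + 1/hits) in this diagonal case
```

Poorly observed pixels are strongly damped (weight 0.5 for a single hit), while well observed ones are left almost untouched.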

Indeed, given the sizes of the matrices involved, we shall not in general
actually perform these operations (in particular for Planck, with $N_p$
reaching ~ 50 million pixels at some frequencies), and one will rather find
approximations to the various solutions iteratively. This is still an area
of research, in particular when one cannot assume that the beam is
symmetrical (the A matrix must account for the asymmetry, i.e. it does not
simply contain 0 and 1 anymore, and $A^T A$ is not diagonal anymore$^{13}$).
Still, the various developments above give us the general framework within
which algorithmic solutions are being sought.
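
One possible iterative scheme, given here only as a sketch and not as the method of any particular experiment, is to solve $[A^T B^{-1} A]\, x = A^T B^{-1} y$ by conjugate gradients without ever forming the matrices. The example assumes a symmetric beam (A is applied through the pointing list), stationary noise treated as circulant so that $B^{-1}$ is applied by FFT, and a white plus 1/f spectrum whose divergence at f = 0 is capped at the lowest non-zero frequency; all names and numbers are hypothetical.

```python
import numpy as np

def make_apply_binv(n_t, dt, sigma, f_knee):
    """Return v -> B^{-1} v for stationary (circulant) noise, applied in Fourier space."""
    f = np.fft.rfftfreq(n_t, d=dt)
    f[0] = f[1]                       # assumption: cap the 1/f divergence at f = 0
    inv_spec = 1.0 / (sigma**2 * (1.0 + f_knee / f))
    return lambda v: np.fft.irfft(inv_spec * np.fft.rfft(v), n=n_t)

def cg_map(pix, y, n_pix, apply_binv, n_iter=200, tol=1e-10):
    """Solve [A^T B^{-1} A] x = A^T B^{-1} y by conjugate gradients."""
    a_t = lambda v: np.bincount(pix, weights=v, minlength=n_pix)   # A^T v
    lhs = lambda x: a_t(apply_binv(x[pix]))                        # A^T B^{-1} A x
    rhs = a_t(apply_binv(y))
    x = np.zeros(n_pix)
    r = rhs.copy()                    # residual for the starting guess x = 0
    p = r.copy()
    rr = r @ r
    for _ in range(n_iter):
        if rr <= (tol * np.linalg.norm(rhs))**2:
            break
        q = lhs(p)
        alpha = rr / (p @ q)
        x += alpha * p
        r -= alpha * q
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

rng = np.random.default_rng(3)
n_t, n_pix = 512, 16
pix = rng.permutation(np.arange(n_t) % n_pix)        # hypothetical scan
y = rng.standard_normal(n_pix)[pix] + 0.3 * rng.standard_normal(n_t)
apply_binv = make_apply_binv(n_t, dt=1.0, sigma=0.3, f_knee=0.05)
x_cg = cg_map(pix, y, n_pix, apply_binv)

# Dense cross-check, affordable only because this toy problem is tiny.
A = np.zeros((n_t, n_pix))
A[np.arange(n_t), pix] = 1.0
Binv = np.column_stack([apply_binv(e) for e in np.eye(n_t)])
x_dense = np.linalg.solve(A.T @ Binv @ A, A.T @ Binv @ y)
```

Each iteration costs only a few FFTs and one pass over the pointing, which is what makes this kind of approach usable when $N_p$ and $N_t$ are large.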

Note also that the covariance matrix of the signal we are searching for, C,
enters the definition of the filter. Of course, we do not know it perfectly,
and we shall make some error. But since Wiener filtering simply amounts
to convolving the noisy map with a larger (known) filter than the beam, an
error will simply mean that the designed filter is not fully optimal. This
will not induce further errors in the analysis, since we know the filter we
have applied, and this can then be fully accounted for in the following steps
of the analysis.

Other methods have been proposed, but all linear methods which derive
from each other by multiplication by invertible matrices are of course
information-theoretically equivalent, and Tegmark [153] showed that the
above methods are lossless (as defined above, “lossless” means keeping the
Fisher matrix constant; note though that the demonstration requires
assuming a Gaussian-distributed noise AND signal).



6.4 Using redundancies 

The difference between two measurements of the same sky point at different
times and with different instrument configurations contains noise and
systematic effects.



$^{12}$Indeed, the Wiener filterings $W_W\, y = C [C + N]^{-1} \hat{x}$ and
$W_{W'}\, y = [A^T A]^{-1} A^T S [S + B]^{-1} y$ apply a signal-to-noise weighting
either to (respectively) the no-prior map, or the timeline.
If we want to do it fast, we would like the inverse weighting to be done in Fourier space,
i.e. we would like to have either C + N or S + B to be stationary, while it is natural to
only expect C and B to be stationary...

$^{13}$As already mentioned in footnote 11, in a high signal-to-noise per beam
experiment (like Planck) one should use pixels for x smaller than the beam FWHM, even
if it were symmetrical, in order that the pixelisation (or aliasing) noise due to the
incompleteness of the representation be small on that scale as compared to the other
noise sources...
Low frequency noise in the time-frequency domain containing no sky
signal can be very efficiently removed using redundancies. The low frequency
noise contains 1/f noise from the electronics but also effects of temperature
variations of detectors or other instrument parts. These temperatures can
be monitored with high accuracy thermometers and these low frequency
signals can be modelled using a reference signal (i.e. a temperature as a
function of time) and a transfer function with a few unknown parameters.

Unknown systematic effects can be searched for by looking for patterns in 
the differences in signals measured on the same sky points at different times and
correlating them with various parameters that might affect the data. When 
found they are modelled as described above. A well identified systematic 
effect is the signal from the sky seen in the far side-lobes of the beam of 
each detector. The dominant contributions are expected to come from the 
CMB dipole, the Sun, Earth and Moon and the Galaxy. 

The procedure described in Sections 6.2 and 6.3 will optimise simultane- 
ously the sky map and all the systematic effects models taking into account 
the correlations of the residual noise. This will, in fact, minimise the
differences of the signals observed at the same sky points at different times and
with different values of the relevant parameters for the systematic effects, 
so that the residual might indeed obey the assumed statistical properties of 
the noise. 

To illustrate the efficiency of such a procedure to remove low frequency
noise and systematic effects, we discuss a few cases under the very simple
assumption where they are the only spurious signal added to the sky signal.
As they all rely on using the redundancies in the observations, they cannot
generally be removed in sequence. A global map-making solution will have
to be built and applied to the whole set of data. The algorithms briefly
described in the following sections have been built to estimate the accuracy 
with which systematic effects can be removed. They should not be consid- 
ered as elements of a processing pipeline to be applied sequentially to the 
data. 

The MAP mission has a sky scanning strategy similar to the COBE 
one. The measurements will rely on differences between the signals from 
two back to back telescopes observing at a boresight angle of 70 deg with
respect to the spin axis. The satellite spins at 0.4 RPM and this spin axis
precesses with a one hour period around the antisolar direction. 30% of
the sky is observed every day and the ecliptic poles are observed throughout
the mission. This provides an excellent connection between individual scans 
and redundancies with many time scales. 

The Planck mission detects the total power from detectors with a flat
noise spectrum down to a knee frequency ~ $10^{-2}$ Hz. The boresight angle of
the optical axis with respect to the spin axis is 80 deg. The satellite spins
at 1 RPM around its spin axis and the spin axis moves around the antisolar
direction with a period comparable to the orbital period at L2 (about 6
months). This sky scanning strategy does not provide as good a connectivity
between scans as the MAP one, but the baffling of Planck and working
in total power give other advantages, especially for the identification of
possible side lobe signals.

In the Planck mission, the most obvious redundancies, for a given 
detector, are those induced by the spinning of the satellite with a fixed spin- 
axis between consecutive steps of the spin-axis motion. For one fixed spin- 
axis position, the expected signal from the sky is periodic, the period being
the inverse of the spinning frequency, $f_{spin}$. As noise and many systematic
effects do not have that periodicity, most of the fluctuations they induce on 
the time streams can be filtered out simply by combining consecutive scans 
in one single ring of data for each spin-axis position. The combination of 
time streams into a set of rings is a non destructive compression of the useful 
sky signal by ~ two orders of magnitude (since ~ 50 scans are co-added in 
each ring). Of course, scan-synchronous systematics and noise cannot be 
suppressed at this stage. This first step in data compression, however, is 
easy to implement and very efficient in reducing the map-making problem. 
An additional advantage of performing this step is that the effect of any 
“filter” (due to electronics, time constant of bolometers...) on the signal 
can be quantified in an exact way on rings [46] . 
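
The co-addition of scans into a ring can be sketched as follows. The example assumes, for simplicity, an integer number of samples per spin period, white noise, and hypothetical values for the number of scans and samples; the sky signal along the circle is an arbitrary periodic function.

```python
import numpy as np

def coadd_ring(tod, n_per_scan):
    """Average the consecutive scans of one spin-axis position into a single ring."""
    n_scans = tod.size // n_per_scan
    return tod[:n_scans * n_per_scan].reshape(n_scans, n_per_scan).mean(axis=0)

rng = np.random.default_rng(4)
n_per_scan, n_scans, sigma = 500, 60, 1.0          # hypothetical numbers
phase = 2 * np.pi * np.arange(n_per_scan) / n_per_scan
sky = np.sin(phase) + 0.3 * np.cos(5 * phase)      # periodic sky signal along the circle
tod = np.tile(sky, n_scans) + sigma * rng.standard_normal(n_per_scan * n_scans)

ring = coadd_ring(tod, n_per_scan)
rms_residual = np.std(ring - sky)                  # close to sigma / sqrt(n_scans)
```

The sky signal is untouched by the averaging while the white noise is reduced by the square root of the number of scans; scan-synchronous signals, being periodic too, would survive the co-addition.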

The next step in the map-making process is the identification and
subtraction of scan-synchronous spurious fluctuations by connections of the
rings. The basic idea, again, is to compare the signal at intersections
between rings. There are a few thousand such rings per detector, of a few
thousand data samples each$^{14}$.



6.5 Low-frequency noise 

The first concern is the identification and removal of the effect of
low-frequency noise which generates drifts in the signal. If, as expected, the
knee frequency (as defined in Sect. 6.3) of the low-frequency noise is much
smaller than the spinning frequency of the satellite, the main effect of
low-frequency drifts is to displace the average level of each ring by an unknown
offset. A first-order correction amounts to estimating and readjusting these
offsets. Without this step, a direct reprojection on the sky generates stripes
in the maps, as shown in the left panel of Figure 36.



$^{14}$For a one year Planck mission, with a spin axis kept in the antisolar direction, and
moved by steps of 2.5 arcmin every hour, there are 5400 “rings” per detector with 11 200
“samples” on each ring taken at 1/2.6 of a 5' FWHM beam. The same spatial sampling
in the cross-scan direction will be achieved by using multiple (staggered) detectors at the
same frequency.
Fig. 36. Maps of the sky before and after simple destriping. Note the change of
scale between the maps. 

6.5.1 Simplest destriping 

For unpolarised measurements, one can assume to first order that the main 
beam signal is independent of the orientation. This allows for direct offset 
estimation by comparison of signal differences at intersection points between 
rings for each detector individually. A least-squares minimisation algorithm
has been implemented and tested for destriping Planck maps [44]. For the
Planck HFI, the 1/f noise from the electronics has a knee frequency below
0.01 Hz. For such characteristics, this first-order method is very satisfactory, 
as illustrated by Figure 36. 
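
The principle of such an offset estimation can be sketched on a toy case. The code below is not the algorithm of [44]; it is a minimal least-squares illustration assuming one constant offset per ring, a sky signal that cancels exactly in the difference taken at a crossing point, and a hypothetical geometry in which every pair of rings crosses once.

```python
import numpy as np

def solve_offsets(ring_a, ring_b, diff, n_rings):
    """Least-squares ring offsets from signal differences at crossing points.

    diff[k] is (signal on ring_a[k]) - (signal on ring_b[k]) at crossing k.
    An extra equation forces the offsets to sum to zero, since differences
    alone cannot fix their mean."""
    n_c = diff.size
    M = np.zeros((n_c + 1, n_rings))
    M[np.arange(n_c), ring_a] = 1.0
    M[np.arange(n_c), ring_b] = -1.0
    M[n_c, :] = 1.0
    return np.linalg.lstsq(M, np.append(diff, 0.0), rcond=None)[0]

rng = np.random.default_rng(5)
n_rings = 20
offsets_true = rng.standard_normal(n_rings)
offsets_true -= offsets_true.mean()
ring_a, ring_b = np.triu_indices(n_rings, k=1)     # hypothetical: all pairs of rings cross
diff = (offsets_true[ring_a] - offsets_true[ring_b]
        + 0.01 * rng.standard_normal(ring_a.size)) # sky cancels, noise remains

offsets_hat = solve_offsets(ring_a, ring_b, diff, n_rings)
```

Subtracting `offsets_hat` from the rings before reprojection is the first-order destriping discussed above; only the overall level of the map remains undetermined.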

This method allows very efficient destriping even when rings intersect
only at the ecliptic poles. If the knee frequency is somewhat higher than
0.01 Hz but the crossing points between the rings are well spread over the
rings, striping can still be removed. The low frequency noise can be modelled
by a limited Fourier series for each ring. The coefficients of this
decomposition can be obtained by minimising differences at crossing points if
these points are well spread along the rings.

The method can be generalised when the signal depends not only on the 
direction in which the optical axis is pointed but also on the roll angle of 
the satellite around this direction. This is the case for polarisation signals. 
Revenu 2000 [136] has shown that combining the signals of the polarised
detectors at the same frequency but with different directions of the
polarisation axis allows the low frequency noise to be eliminated as efficiently
as in the intensity signal. Figure 37 illustrates the distribution of
intersection points for one such scan strategy.

6.6 Contributions from emission in the far side-lobes of the beam 

The problem of the sidelobe contributions to the signal is central to the 
design of Planck, since the angular resolution depends drastically on the 
amount of spillover radiation that is acceptable. 
Fig. 37. (a) Average number of intersections per degree along the scan, as a func-
tion of angle on a Planck ring, for a scan strategy with 8 sinusoidal oscillations 
per year and 10° out-of-ecliptic amplitude, and a scan angle of 85°. Intersections 
are concentrated around the points of closest approach to the ecliptic poles (at 
angles of 0 and 180 degrees), but there are at least a few points per degree
everywhere along the scan. (b) Distribution of crossing points between circles on the
sky for the same scan strategy. There is more redundancy around the poles, but 
with still some redundancy everywhere. For comparison, a 90° angle anti-solar 
scanning would yield crossing points at the North and South ecliptic poles only. 



The very low side lobe in the anti bore-sight direction provided by the 
main baffle is illustrated in Figure 38. It can also be seen in Figure 38 
that the side lobe signal varies systematically during the year. The signal 
is minimum when the central part of the galactic disc is behind the main 
baffle. When the same point of the sky at low ecliptic latitude is observed 
6 months apart, the sidelobe configuration is very different: the roll angle
of the satellite around the optical axis is different by roughly 180 deg. The 
sidelobe signal can thus be identified as the difference between these two 
measurements. 

Even if the fluctuations due to sidelobe pickup are at a level lower than
the sensitivity per 10 arcmin pixel (and thus are not too much of a worry
for high-$\ell$ measurements of $C_\ell$), they will affect the outcome of the
experiment at low spatial frequencies and need to be extracted.

It has been shown that if the sidelobe signal is extracted from the mea- 
surements using the redundancy in the data, the galactic disc is a source 
compact enough for the side lobes to be deduced from a one year side lobe 
signal [45]. This is illustrated in Figure 39. This shows that the system is
not degenerate, and that an iterative solution can probably be found.
Fig. 38. The top panel displays the computed Planck side lobe at 30 GHz
around the optical axis and around the opposite direction when the baffling is not
included. The middle panel shows the effect of adding the baffling which leads to
a higher contrast. The lower panel shows the side lobe signal in μK during one
year. The contrast between the low and the high period during the year is due 
to the position of the central region of the Galactic disc with respect to the well 
shielded direction. Reprinted from [42]. 
Fig. 39. Far sidelobe recovery by deconvolution of the signal due to the emis-
sion of the galactic dust in a model 350 GHz antenna pattern. This illustrates 
that redundancies are sufficient to use the galactic emission for sidelobe mapping. 
Actual sidelobe recovery, however, requires an iterative inversion of the data. 

7 Maps analysis methods 

7.1 Methods of component separation 

The joint analysis of single detector maps to create a “channel” map (i.e. at
a given frequency) follows the same principles as the elementary
map-making above, although complications may arise in the case of strong
signal-to-noise and/or asymmetrical beams. Indeed, to formalise the problem, one
may arrange the various TODs (of length $N_t$) of the $N_d$ detectors probing
the sky at the same frequency as a single vector y, which will be of length
$N_d N_t$; the map to create, x, will be the same, and A will then be an
$N_d N_t \times N_p$ Observation matrix. This is exactly the same problem we dealt
with earlier.

We now focus on the next step, i.e. the joint analysis of a number $N_\nu$
of pixelised sky maps at different frequencies (i.e. channel maps) to
extract information on the different underlying $N_c$ astrophysical components,
assuming that calibrated (channel) maps of known angular resolution have
been generated, with known noise covariance matrix (we assume here that
all maps have the same number of pixels, $N_p$). In this case, we may again
write $y = Ax + b$, but now y will be a series of channel maps arranged as a
vector of length $N_\nu N_p$, the component maps to be recovered will be a vector
of length $N_c N_p$, and A will stand for a “beam convolution and integration
in the spectral band” operator. Note that A must describe the beam
convolution if the maps obtained at different frequencies have different angular
resolutions, which is the natural case (diffraction law at a given frequency
for a given optics size).

Let us in addition assume till the end of this section that we have applied
a spherical harmonic transform to these equations (convolutions then
transforming into products) and that the experiment has full sky coverage. We
then have

$y(\ell) = A(\ell)\, x(\ell) + b(\ell)$,    (7.1)

to be solved for each multipole $\ell$ independently (thanks to the complete sky
coverage$^{15}$). This is just a convenient change of basis (modes instead of
pixels). Note that we used the same notations for all transformed quantities
to lighten the presentation, since we shall always remain in the transformed
space till the end of this chapter. One should simply take into account
that we now deal with complex quantities, and transposes must be replaced
by Hermitian conjugates (transpose of the complex conjugate). For
completeness, a step by step derivation of this form is reproduced from [27] in
Appendix A.

It may turn out that we have more templates to recover than there
are frequency points, or even if we restrict ourselves to a smaller number
of templates, some templates like those of the synchrotron at large $\ell$
may produce such a small contribution to the overall signal that huge
relative variations may be easily accounted for by small noise variations, etc.
This problem did not explicitly show up at the map-making stage since then
the problem was in fact vastly over-constrained (for a reasonable scanning
strategy), not under-constrained. We then need to regularise
the problem, i.e. to specify what to do in case of degeneracies. This is exactly
what (“theoretical”) priors do.

$^{15}$In the more general case, modes are coupled, and one has to solve a much larger
problem.

It is important to understand that even the no-prior solution will in fact
have a (hidden) prior in this case. Suppose we want to perform a $\chi^2$
minimisation and use a Singular Value Decomposition (hereafter SVD) to perform
the minimisation. In this case, one implicitly selects the solution of minimal
norm $M = x^T x$. In fact, most inversion methods can be formulated as the
task of minimising the expression $\chi^2 + \alpha M$, where $M$ is a measure of
some desirable property of the solution, and $\alpha$ balances the trade-off
between fitting best the noisy data (as measured by $\chi^2$) and imposing a
reasonable behaviour (as measured by $M$). We saw earlier two (explicit) types
of regularisation, obtained by giving some form to P(T). Assuming a Gaussian
theory leads to Wiener filtering, which was originally derived in this
multi-frequency multi-resolution context by [28] and [157] (by directly
minimising the variance of the reconstruction errors). Assuming an entropic
prior leads to a Maximum Entropy (MEM) reconstruction, as shown in Hobson et
al. [87].
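The hidden prior of the SVD solution can be checked on a toy under-constrained system. The sketch below uses a hypothetical 2-channel, 3-component matrix with unit noise: the SVD pseudo-inverse returns, among all solutions that fit the data exactly, the one of minimal norm, and the minimiser of $\chi^2 + \alpha\, x^T x$ tends to it as $\alpha \to 0$.

```python
import numpy as np

# Hypothetical under-constrained problem: 2 channels, 3 components.
A = np.array([[1.0, 1.0, 0.5],
              [1.0, 2.0, 0.5]])
y = np.array([3.0, 4.0])

# SVD-based pseudo-inverse solution.
x_svd = np.linalg.pinv(A) @ y

# Any multiple of a null-space vector of A can be added without changing A x.
_, _, vh = np.linalg.svd(A)
null_vec = vh[-1]                       # right singular vector with A @ v = 0
x_other = x_svd + 0.7 * null_vec

print(np.allclose(A @ x_svd, y), np.allclose(A @ x_other, y))
print(np.linalg.norm(x_svd), np.linalg.norm(x_other))

# Explicit regularisation: minimise |y - A x|^2 + alpha * x^T x.
alpha = 1e-8
x_reg = np.linalg.solve(A.T @ A + alpha * np.eye(3), A.T @ y)
print(np.abs(x_reg - x_svd).max())
```

Both `x_svd` and `x_other` reproduce the data; choosing between them is precisely the role of the prior.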

7.2 Final map accuracy achievable 

Here we focus on (optimal) linear inversion, since the properties of the
solution are readily computable by simple linear algebra. Appendix B gives a
derivation of the formulae:

$$\langle |\hat{x}_p|^2 \rangle = Q_p\, \langle |x_p|^2 \rangle = Q_p\, C_p(\ell), \qquad (7.2)$$

$$\varepsilon_p^2 = (1 - Q_p)\, C_p = \frac{1}{Q_p} \left[ \sum_{p' \neq p} R_{pp'}^2\, C_{p'} + \sum_{\nu\nu'} W_{p\nu} W_{p\nu'} B_{\nu\nu'} \right], \qquad (7.3)$$

$$\frac{\Delta C_p}{C_p} = \sqrt{\frac{2}{2\ell+1}}\; \frac{1}{Q_p}, \qquad (7.4)$$

where $Q_p = R_{pp}$ stands for the diagonal elements of $R = WA$ ($W$ standing
for the Wiener filtering method). Thus $Q_p$ tells us

1. how the typical amplitude of the Wiener-estimated modes $\hat{x}$ is
damped as compared to the real ones (Eq. (7.2));

2. the spectrum of the residual reconstruction error in every map
(Eq. (7.3)); note that this error may be further broken down into residuals
from each component in every map by using the full $R$ matrix;

3. the uncertainty added by the noise and the foreground removal
($\propto 1/Q_p - 1$, Eq. (7.4)) to cosmic (or sampling) variance, which is
given by $\sqrt{2/(2\ell+1)}$ (this result only holds, though, under the
simplifying assumption of Gaussianity of all the sky components).
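These relations are easy to check numerically for a single mode. The sketch below assumes the standard form of the Wiener matrix, $W = C A^\dagger (A C A^\dagger + B)^{-1}$, with a diagonal template covariance $C$ and a noise covariance $B$; all numerical values are hypothetical. It computes $Q_p$, verifies that the two expressions for the residual error agree, and evaluates the fractional error on the power spectrum.

```python
import numpy as np

def wiener_quality(A, C, B):
    """Return the Wiener matrix W, R = W A and the quality factors Q_p = R_pp."""
    W = C @ A.conj().T @ np.linalg.inv(A @ C @ A.conj().T + B)
    R = W @ A
    return W, R, np.real(np.diag(R))

# Hypothetical single-mode set-up: 3 channels, 2 components.
A = np.array([[1.0, 0.3],
              [1.0, 1.0],
              [1.0, 3.0]])
C = np.diag([100.0, 4.0])        # template spectra C_p at this multipole
B = np.diag([1.0, 0.5, 2.0])     # noise covariance between channels

W, R, Q = wiener_quality(A, C, B)
Cp = np.diag(C)

# Residual error, first form: (1 - Q_p) C_p.
err_direct = (1.0 - Q) * Cp

# Second form: leakage from the other components plus noise, divided by Q_p.
leak = (R**2) @ Cp - Q**2 * Cp                  # sum over p' != p
noise = np.einsum('pi,pj,ij->p', W, W, B)       # (W B W^T)_pp
err_split = (leak + noise) / Q

# Fractional error on the power spectrum at multipole ell.
ell = 500
dC_over_C = np.sqrt(2.0 / (2 * ell + 1)) / Q

print(Q, err_direct, err_split, dC_over_C)
```

A value of $Q_p$ close to one means that the component is recovered with little damping and little extra variance.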

It is these interesting properties which led Bouchet and Gispert to suggest
using $Q_p$ as a “quality factor” to assess the ability of experimental
set-ups to recover a given process $p$ in the presence of other components; it
assesses in particular how well the CMB itself can be disentangled from the
foregrounds (and it can be generalised to the polarised case, see [30]).

We now turn to visualising the implications of these formulae in the
specific case of the MAP and Planck experiments, using the sky model
described earlier.

Examples of Wiener matrices. Once the instrument, through the $A$
matrix, and a covariance matrix $C$ of the templates have been assumed, the
Wiener filter is entirely determined through equation (6.16). Figure 40a
offers a graphical presentation of the resulting values of the Wiener matrix
coefficients of the CMB component when we use our sky model summarised
in Section 4.3 to specify $C$ (assuming negligible errors in designing the
filter). It shows how the different frequency channels are weighted at
different angular scales and thus how the $\nu$–$\ell$ information gathered by
the experiment is used.

In the MAP case, most weight is given to the 90 GHz channel. It is
the only one to gather CMB information at $\ell \gtrsim 600$, and its own weight
becomes negligible at $\ell \gtrsim 1000$.

In the Planck case, the 143 GHz channel is dominant till $\ell \sim 1400$
and useful till $\ell \sim 2000$, while the 217 GHz channel becomes dominant at
$\ell \gtrsim 1400$ and gathers CMB information till $\ell \sim 2300$. The 100 GHz
channel contribution (split in two on the graphics to show the relative HFI and
LFI contributions) to the CMB determination is only modest, and peaks at
$\ell \sim 800$. This graph suggests that the impact of the LFI on the high-$\ell$
HFI measurements ($\ell > 200$) will be in controlling systematics and possibly
in better determining the foregrounds. Of course, one should note that
this analysis assumes the foregrounds are known well enough to design the
optimal filter. Thus the LFI impact might be greater in a more realistic
analysis, when the filter is designed with the help of the measurements
themselves only.

Effective windows. Figure 40b gives the effective $\ell$-space window of the
experiment for each component, $Q_p(\ell)$ (Eq. (7.2)), to be compared with
the individual (optical) windows of each channel, $w_i(\ell)$. For the CMB, it
shows the gain obtained by combining channels through Wiener filtering.
For MAP, the effective beam is nearly equal to its 90 GHz beam. As
expected, the Galactic foregrounds are poorly recovered except on the largest
scales (small $\ell$) where their signal is strongest (we assumed spectra
$\propto \ell^{-3}$).
Fig. 40. Top row: CMB Wiener matrix elements at $\ell > 200$ for MAP (left)
and Planck (right). Each bin in $\ell$ is the average over one hundred
contiguous values. The frequency bins are centred at the average frequency of
each channel, but their width is not representative of their spectral width.
Middle row: corresponding square of the $\ell$-space effective windows (i.e.
$Q_p$). As usual, black is for the CMB, red, blue, and green are for the
Galactic components, yellow is for the SZ contribution. The transforms of the
channel optical beams are shown for comparison as dotted orange lines. Bottom
row: CMB reconstruction error contributed by each component (with the same
line coding as above) and their total in black. The integral of
$\ell(\ell+1)\varepsilon^2/2\pi$ would give the reconstruction error of the map.
The dashed-dotted line shows for comparison $\ell\, C(\ell)^{1/2}$ for the CMB.
For Planck, the total of the reconstruction error is nearly constant at all
$\ell$ and corresponds to $\varepsilon(\ell) \simeq 0.025\ \mu$K, a factor more
than six below the MAP case. Reprinted from [27].



F.R. Bouchet et al.: The Cosmic Microwave Background 






The only exception is that of the H I-correlated component, which is of course
very well traced by the HFI of Planck. Note also the more than twofold
increase in effective angular resolution between MAP and Planck.

Maps reconstruction errors. We now estimate the spectrum of the
reconstruction errors in the map by using equation (7.3). Figure 40c compares
these residual errors per individual $\ell$ mode with the typical amplitude of
the true signal. The first thing to note is that the error spectrum cannot be
accurately modelled as a simple scale-independent white noise (as expected
from the $\nu$–$\ell$ shape of the Wiener matrix for the CMB). It is interesting
to note that in the Planck case the largest contribution other than noise comes
from SZ clusters in the range $50 \lesssim \ell \lesssim 500$, to be then
superseded by the leakage from the radio-sources background. One should also
note that the 10 $\mu$K level is reached at $\ell \sim 200$ for MAP and
$\ell \sim 2000$ for Planck.

CMB power spectrum errors. Finally, we can estimate the uncertainty
added by the noise and foreground removal to the CMB power spectrum
by using equation (7.4). Figure 41 shows the envelope of the 1-$\sigma$ error
expected from the current design of MAP (green), the LFI (blue), and
the HFI or the full Planck (red), as well as a “boloBall” experiment meant
to show what bolometers on balloons might achieve soon; the experimental
characteristics used in this comparison may be found in Table 3.

Note that there has been no “band averaging” in this plot (see footnote),
which means that there is still cosmological signal to be extracted from the
HFI at $\ell \sim 2500$ (a 10% band width around $\ell = 1000$ spans 100
modes...). The message from this plot is thus excellent news, since it tells us
that even accounting for foregrounds, Planck will be able to probe the very
weak tail of the power spectrum ($\ell > 2000$) and allow breaking the near
degeneracy between most of the cosmological parameters.

The second panel of the figure shows the variations induced by changing 
the target CMB theory. In this case, a flat universe with a (normalised) 
cosmological constant of 0.7 was assumed. This shows that if the third peak 
is low, the measurement by MAP will be less accurate. This illustrates the 
fact that the errors on the signal will of course depend on the amplitude 
of the signal; it is a reminder of the fact that the numbers given so far 
are meant to be illustrative. Bouchet and Gispert [27] also looked at the 
impact of changing the foreground model and found very little impact for 
rather large variations of the assumed sky model, a result confirmed by an 
alternative Fisher matrix analysis [158]. 



Footnote: The band averaging would reduce the error bars on the smoothed
$C(\ell)$ approximately by the square root of the number of multipoles in each
band, if the modes are indeed independent, which is a reasonable approximation
for full sky experiments.







The Primordial Universe 




Fig. 41. Expected errors on the amplitudes of each mode individually (no band
averaging) for different experiments (the full Planck is indistinguishable from
the HFI case). The thin central lines give the target theory plus or minus the
cosmic variance, for a coverage of 2/3 of the sky. a) Standard CDM. b) Lambda
CDM (with $\Omega_b = 0.05$, $\Omega_{CDM} = 0.25$, $\Omega_\Lambda = 0.7$,
$h = 0.5$). Reprinted from [27].



7.3 Numerical simulations 

The previous semi-analytical analyses used Wiener filtering to estimate the 
efficiency of the separation of Planck observations into physical 
components. However, Wiener filtering is linearly optimal only for
Gaussian processes. In addition, it relies on a prior, the covariance matrix of
the templates, that has to come from a first inversion whose precision needs
to be assessed. For all these reasons, Bouchet and Gispert conducted for
Planck [14,124] simulations of the sky observed with various experimental
set-ups, which were analysed with this formalism, as well as with a nonlinear
Maximum Entropy Method (MEM) [87,88].



7.3.1 Simulations of the observations 

The simulations of the observations included the three Galactic components,
the primordial CMB fluctuations, and the Sunyaev-Zeldovich effect from
clusters (both thermal and kinetic). The only astrophysical source of
fluctuations missing from the simulation shown (but see below) is that arising
from the background of unresolved sources other than clusters, which is
at very faint levels where the CMB contribution is strongest. The template
maps cover 12.5 × 12.5 square degrees with 1.5' × 1.5' pixels. For the primary
$\Delta T/T$ fluctuations, a COBE-normalised standard CDM model was used.
Realisations of the thermal and kinetic effects of clusters are stored
respectively as maps of the $y$ parameter and of $\delta T/T$ fluctuations, as
before (see Fig. 19). To simulate Galactic emissions, templates were extracted
from IRAS and the 408 MHz survey of Haslam (see footnote) and extrapolated to
the required frequencies with the spectral model discussed earlier.

The measurement process by the Planck instruments was simulated
in the following way. Unit transmission across each spectral channel (with
$\Delta\nu/\nu = 0.20$ for the LFI and $\Delta\nu/\nu = 0.25$ for the HFI) was
assumed, and the spectra were integrated across each waveband for each spectral
component. The angular response of each channel was assumed to be
Gaussian (of the corresponding FWHM), and the sky maps were convolved
with these beams. Finally, isotropic noise maps were added, assuming a
spatial sampling at 1/2.4 of the beam FWHM (see footnote). This clearly assumes that



Footnote: Since the Haslam map [80] has an angular resolution of only 0.85 deg,
the authors added to the templates small scale structures with a
$C(\ell) \propto \ell^{-3}$ power spectrum, thereby extrapolating to small
scales the observed behaviour of the spatial spectrum. In practice, the maps
were Fourier transformed (which is equivalent to a spherical harmonics
decomposition since the size of the map is much smaller than a radian) and
their spectrum was computed. New harmonics were then generated at larger
$\ell$, and globally normalised so that their spectrum smoothly extended the
measured one. The results were then transformed back to real space.
the prior destriping procedure has been efficient enough for the residuals
to be negligible. Figure 42 shows an example of such simulated maps as
“observed” by Planck in a region with nearly median foregrounds.

7.3.2 Analysing simulated observations 

A first inversion by $\chi^2$ minimisation (using an SVD decomposition) was
made in order to deduce power spectra (see footnote), which were then used to
derive the Wiener filters used to “polish” the component separation.

Figure 43 shows a comparison between the zero-mean input template $T_i$
of the simulated maps of Figure 42, and that recovered, $T_o$, after Wiener
analysis of the observations. The accuracy of this inversion may be judged
with the help of Figure 44 (which actually corresponds to the analyses of
the HFI alone):

• Panel a) shows the reconstruction errors per 1.5' pixel,
$\varepsilon(x) = T_i(x) - T_o(x)$, which have a Gaussian distribution to a
high degree of accuracy (see footnote), with an rms, $\sigma$, which is 15% of
the rms of the input maps (i.e. $\sigma/T \simeq 4 \times 10^{-6}$, with a quite
weak skewness, $S = \langle(\varepsilon - \bar\varepsilon)^3\rangle/\sigma^3 = -0.007$,
and a quasi-Gaussian kurtosis,
$K = \langle(\varepsilon - \bar\varepsilon)^4\rangle/\sigma^4 = 3.0$);

• The contours of panel b) show a tight correlation between the input
and recovered values, with no bias, no outliers, and an even distribution
of the recovered values for all input values;

• Panel c) compares the spectra of the input, output and difference.
The difference spectrum reaches 5 $\mu$K at $\ell \sim 800$, instead of 1200 as
suggested by the theory. It is only for scales $\ell > 2000$ that the signal
becomes too weak to be fully recovered. Similar numbers are obtained
for the analysis of full Planck observations. Note that these numbers
correspond to the analysis of only one map. The residual errors on
the spectrum determination should thus be much smaller once a large
fraction of the sky has been analysed.
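The statistics quoted for panel a) can be computed as in the following sketch, which uses the moment definitions given above (so that a Gaussian has $K = 3$). The maps here are synthetic Gaussian draws standing in for $T_i$ and $T_o$; they are an assumption made for the example, not the simulated Planck maps.

```python
import numpy as np

def error_statistics(t_in, t_out):
    """rms, rms relative to the input map, skewness and kurtosis of the errors."""
    eps = np.asarray(t_in, dtype=float) - np.asarray(t_out, dtype=float)
    d = eps - eps.mean()
    sigma = np.sqrt(np.mean(d**2))
    skewness = np.mean(d**3) / sigma**3
    kurtosis = np.mean(d**4) / sigma**4      # equals 3 for a Gaussian
    return sigma, sigma / np.std(t_in), skewness, kurtosis

# Synthetic stand-ins: an input map with 77.6 (arbitrary units) rms and
# Gaussian reconstruction errors at 15% of that level.
rng = np.random.default_rng(0)
t_in = rng.normal(0.0, 77.6, size=200_000)
t_out = t_in - rng.normal(0.0, 0.15 * 77.6, size=t_in.size)

sigma, ratio, S, K = error_statistics(t_in, t_out)
print(sigma, ratio, S, K)
```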



Footnote: The noise rms of the simulated maps is thus ~ 2.4 times greater than
the numbers recalled in the performance Table 3. These are recovered if one
degrades the resolution of the noise maps to the FWHM of the corresponding
channel.

Footnote: This was done in Fourier space, mode by mode. From the modes
obtained, the spectra were deduced by removing the noise bias (this inversion
noise being estimated by the algorithm itself) and averaging over angles. A fit
was then performed for each component. The CMB was fitted by a power law times
an exponential cut-off, the
SZ part by an l~^ power law, while all the Galactic components (both for their auto- 
and cross-correlations) were fitted with power laws. 

Footnote: This should simplify both the analysis of parameter estimation, and
the quantitative appraisal of the Gaussianity (or not) of the primary signal,
via for instance measurements of the bispectrum, see [82].




Fig. 42. Simulated maps of 9 degree diameter, as observed by Planck. The top
row shows the first three channels of the LFI instrument (its 100 GHz map is not
shown). The next two rows correspond to the six channels of the HFI instrument.
The rms level of the foregrounds corresponds approximately to their median value.
The figure in the last row shows the power spectra of the processes contributing
to the 100 GHz “observation” above.
Fig. 43. An example of component separation of the Planck “observations” of
Figure 42. a) Template map of the $\Delta T$ anisotropies used in the simulation
(note the similarity with the 143 GHz observations by the HFI). b) Map
recovered by Wiener filtering.




Fig. 44. Accuracy of the reconstruction of the $\Delta T$ map of Figure 43 by
Wiener filtering of HFI observations. (a) Histogram of the reconstruction
errors. It is well fitted by a Gaussian with $\sigma$ = 15% of the input map
rms of 77.6 $\mu$K (red line). (b) Contours in the plane of recovered values
(per pixel) versus input values. The inner contour already contains ~ 90% of
the pixel values, while the outer one contains all pixels. (c) Comparisons
between (band averaged) power spectra of the input (black), output (red), and
difference map (blue).



It was found that inversions by MEM lead to very similar quantitative
conclusions as far as the CMB is concerned (which is dominant and assumed
Gaussian). The situation is different, though, for the recovery of the SZ effect
from clusters. MEM then does better than Wiener filtering for this tenuous
but strongly non-Gaussian signal. Figure 45 shows that even in this small
region of 9° diameter with median foregrounds, one easily detects at least ten
Fig. 45. An example of separation using MEM of the SZ $y$ component from the
“observations” of Figure 42. The left panel shows the template map used in
the simulation, after convolution with a 10' beam; the middle panel shows the
map recovered by MEM applied to Planck observations.



clusters (the strongest cluster of this image has a central $y$ value of
$5 \times 10^{-6}$ at a 10' resolution, but weaker ones with a central value
$\sim 2 \times 10^{-6}$ are easily recovered; see footnote). The Planck
catalogue of SZ clusters should then have ~ 10000 entries (although this number
is quite dependent on the cosmology selected). This opens very exciting
prospects for constraining the gas history at high $z$ and for
cross-correlating such maps with galaxy catalogs and X-ray and weak lensing
maps.

Of course, one could anticipate that MEM would be good too for detecting
point sources. And indeed, when the simulations above were sprinkled
with point sources according to the model of Toffolatti et al. [160], MEM did
very well at extracting them, as can be judged from Figure 16, which further
suggests (see footnote) a detection threshold of ~ 70 mJy at 44 and 353 GHz.

The component separation problem is still an active area of research, and 
many groups investigate alternative ways to improve on one aspect or the 
other (speed, better sensitivity to certain types of non-Gaussian structure, 
different priors, etc.). In addition, no one to our knowledge has yet simulated 
numerically a polarised components analysis. 



Footnote: These values are about 1.5 times larger at twice the resolution.

Footnote: One may fear that the model of Toffolatti et al. [160], while quite
accurate at low frequencies, $\nu < 100$ GHz, might underestimate the confusion
limit at high frequencies, $\nu > 100$ GHz. The detection threshold might
actually be worse at 353 GHz for other distributions of galaxies, like those in
the models of Guiderdoni et al. [72].
7.4 Joining ends

Once a map is obtained, one still needs to see which constraints it sets on
physics, i.e. quantitative analyses are needed. In the Gaussian case, all the
map information may be compressed in the form of a two-point statistic,
like the two-point correlation function defined in (3.3), or the more widely
used power spectrum (which is its spherical harmonic transform).



7.4.1 Power spectrum estimation 

We assume that both the signal and the noise obey Gaussian statistics at
the end of the component separation; the power spectrum, $C_\ell$, may then be
obtained, given the map $x$, by maximising the likelihood function (see
footnote)

$$L(x|C_\ell) = \frac{\exp\left(-\frac{1}{2}\, x^T X^{-1} x\right)}{(2\pi)^{N_p/2}\, (\det X)^{1/2}}, \qquad (7.5)$$

where $X(C_\ell)$ stands for the covariance matrix of the data. This is what was
done to analyse the COBE data [21,35,66-68,85,151,155].

Although written as a simple Gaussian distribution in the data, this is a
complicated non-linear function of the power spectrum, which enters through
$X$. Indeed, if we work in the pixel basis, then with previous notations
$X = C + N$, the signal covariance between pixels $i$ and $j$ being given by
equation (3.3),

$$C_{ij} = \xi(\theta_{ij}) = \sum_\ell \frac{2\ell+1}{4\pi}\, C_\ell\, P_\ell(\cos\theta_{ij}), \qquad (7.6)$$

while the noise covariance N is a result of previous steps of the analysis. 
As opposed to the map-making and component separation cases (where the 
data were linearly related to the theory by y = Ax + b), there is no closed 
form solution to this equation and we shall resort to numerical estimation. 

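In order to minimise $\mathcal{L} = -\ln L$, we want to find the roots of its
first derivative

$$\frac{\partial\mathcal{L}}{\partial C_\ell} = -\frac{1}{2}\, x^T X^{-1} P^\ell X^{-1} x + \frac{1}{2}\, \mathrm{tr}\left(X^{-1} P^\ell\right), \qquad (7.7)$$

which follows from equation (7.5), with (in the pixel basis, Eq. (7.6))

$$P^\ell \equiv \frac{\partial X}{\partial C_\ell}, \qquad P^\ell_{ij} = \frac{2\ell+1}{4\pi}\, P_\ell(\cos\theta_{ij}). \qquad (7.8)$$

Footnote: This implicitly assumes a uniform prior, so that
$P(C_\ell|x) \propto L(x|C_\ell)$.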
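A Taylor expansion in the neighbourhood of a trial value, $\bar C_\ell$, yields

$$\left.\frac{\partial\mathcal{L}}{\partial C_\ell}\right|_{C} \simeq \left.\frac{\partial\mathcal{L}}{\partial C_\ell}\right|_{\bar C} + \sum_{\ell'} \left.\frac{\partial^2\mathcal{L}}{\partial C_\ell\, \partial C_{\ell'}}\right|_{\bar C} \left(C_{\ell'} - \bar C_{\ell'}\right), \qquad (7.9)$$

which gives us a means to locate these roots iteratively. It involves the
second order derivative given by

$$\frac{\partial^2\mathcal{L}}{\partial C_\ell\, \partial C_{\ell'}} = x^T X^{-1} P^\ell X^{-1} P^{\ell'} X^{-1} x - \frac{1}{2}\, \mathrm{tr}\left(X^{-1} P^\ell X^{-1} P^{\ell'}\right), \qquad (7.10)$$

whose expectation value is the Fisher matrix

$$F_{\ell\ell'} \equiv \left\langle \frac{\partial^2\mathcal{L}}{\partial C_\ell\, \partial C_{\ell'}} \right\rangle = \frac{1}{2}\, \mathrm{tr}\left(X^{-1} P^\ell X^{-1} P^{\ell'}\right). \qquad (7.11)$$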



If the trial solution is close to the root, we can approximate the second order
derivative by its expectation value (see footnote), and one needs to solve

$$\left.\frac{\partial\mathcal{L}}{\partial C_\ell}\right|_{\bar C} + \sum_{\ell'} F_{\ell\ell'} \left(C_{\ell'} - \bar C_{\ell'}\right) = 0. \qquad (7.12)$$

This can be done iteratively according to

$$C_\ell^{(n+1)} = C_\ell^{(n)} - \sum_{\ell'} F^{-1}_{\ell\ell'} \left.\frac{\partial\mathcal{L}}{\partial C_{\ell'}}\right|_{C^{(n)}}, \qquad (7.13)$$

which is the Newton-Raphson method for solving non-linear systems of
equations. Note that this formula involves a quadratic form of the map
vector, which is why this is referred to as a quadratic estimator of the power
spectrum. It was derived by Bond et al. [24,118], and an alternative
derivation was given by Tegmark [153]. One may further consult Bond et al. [23]
for a recent and in-depth review of the computing challenges of CMB
analysis.
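The iteration can be sketched on a toy problem small enough to check by hand. The example below assumes a hypothetical six-pixel map with two "bands" whose derivative matrices are projectors onto disjoint sets of pixels, and white noise of variance 0.1; in that diagonal case the maximum-likelihood answer is simply the mean square of the data in each band minus the noise variance, which the iteration should reproduce. The function and variable names are illustrative only.

```python
import numpy as np

def newton_raphson_spectrum(x, P, N, C0, n_iter=10):
    """Iterate C <- C - F^{-1} dL/dC, with dL/dC and F as in Eqs (7.7), (7.11)."""
    C = np.array(C0, dtype=float)
    for _ in range(n_iter):
        X = N + sum(c * Pl for c, Pl in zip(C, P))     # data covariance
        Xinv = np.linalg.inv(X)
        z = Xinv @ x
        XinvP = [Xinv @ Pl for Pl in P]
        # Gradient of L = -ln(likelihood).
        grad = np.array([-0.5 * z @ Pl @ z + 0.5 * np.trace(XP)
                         for Pl, XP in zip(P, XinvP)])
        # Fisher matrix F = 1/2 tr(X^-1 P X^-1 P').
        F = np.array([[0.5 * np.trace(Ma @ Mb) for Mb in XinvP]
                      for Ma in XinvP])
        C = C - np.linalg.solve(F, grad)
    return C

# Toy data: 6 pixels, two bands of 3 pixels each, white noise variance 0.1.
x = np.array([1.0, 2.0, 3.0, 0.5, 0.5, 1.0])
P = [np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]),
     np.diag([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])]
N = 0.1 * np.eye(6)

C_hat = newton_raphson_spectrum(x, P, N, C0=[1.0, 1.0])
print(C_hat)
```

For realistic maps the matrices are dense and the inversions dominate the cost, which is the subject of the discussion that follows.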

Note that the brute force approach applied to COBE is already not
applicable to current data sets like BOOMERanG and MAXIMA, or to future
data sets like those of MAP and Planck. For Planck, with $N_p \sim 10^7$
pixels, this would require ~ 1600 TeraBytes of storage and ~ 25 000 years
of a 400 MHz processor (each evaluation of the likelihood would naively
require $O(N_p^3)$ operations for the required matrix operations and the
determination of the determinant!). Active research is ongoing to exploit
various (possibly approximate) symmetries and transformed forms of the


Footnote: Of course, this may slightly change the rate of convergence, but not
the derived values once they have converged.
Fig. 46. Recovered power spectrum by Oh et al. (1999) from a single simulated
1024 × 2048 pixel map. The grey band indicates the one sigma uncertainty given
by the Fisher matrix. The light dashed line indicates the starting guess.
Reprinted from [118].



problem. Indeed, Oh et al. [118] recently achieved in the MAP case a
reduction from $O(N_p^2)$ storage and $O(N_p^3)$ operations to $O(N_p^{3/2})$
storage and $O(N_p^2)$ operations with negligible loss of accuracy.
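The orders of magnitude quoted above can be recovered with a short back-of-the-envelope calculation. The sketch below rests on assumptions that are ours, not the text's: $N_p = 10^7$ pixels, two dense double-precision $N_p \times N_p$ matrices held in storage, a Cholesky-like cost of $N_p^3/3$ operations, and one operation per clock cycle at 400 MHz.

```python
n_pix = 1.0e7                      # assumed number of pixels for Planck

# Storage: two dense N_p x N_p matrices of 8-byte reals (assumption).
storage_tb = 2 * 8 * n_pix**2 / 1e12

# Time: ~N_p^3 / 3 operations at one operation per cycle, 400 MHz (assumption).
seconds = (n_pix**3 / 3.0) / 400e6
years = seconds / (365.25 * 24 * 3600)

# Scaling gains of an O(N_p^{3/2}) storage, O(N_p^2) operations algorithm,
# ignoring all prefactors.
storage_gain = n_pix**2 / n_pix**1.5
operations_gain = n_pix**3 / n_pix**2

print(f"storage ~ {storage_tb:.0f} TB, time ~ {years:.0f} years")
print(f"gains: storage x{storage_gain:.0f}, operations x{operations_gain:.0e}")
```

Under these assumptions the storage comes out at 1600 TB and the time at a few times $10^4$ years, in line with the figures quoted in the text.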

Figure 46 shows a recovered power spectrum with the method above
applied to simulated MAP observations [118]. Figure 47a demonstrates that
the input power spectrum can be very well recovered under the assumption
that it is a smooth function of $\ell$. This (spline) fit was used in the Fisher
matrix calculation.



7.4.2 Constraints on models 

One possibility to constrain theoretical models would be to write that the
power spectrum $C_\ell(p)$ depends on a vector of model parameters, $p$, and
make explicit the parameter dependence of the likelihood of the map in
equation (7.5), via $X(p)$. One could then attempt a direct measurement of the
Fig. 47. (a) The result of a smoothing spline fit to the points in Figure 46. The
dark solid line is the spline fit, the light solid line is the input power spectrum. The
grey band indicates the 68% confidence band obtained from bootstrap simulations.
Including the prior expectation that the underlying power spectrum is smooth
allows one to obtain a very firm handle on the power spectrum. (b) The solid line
indicates the input power spectrum, while the dashed line indicates the best-fit
cosmological model from the data. The two are virtually identical. Reprinted
from [118].



parameters from the map with the analog of equation (7.13)

$$p_k^{(n+1)} = p_k^{(n)} - \sum_{k'} F^{-1}_{kk'} \left.\frac{\partial\mathcal{L}}{\partial p_{k'}}\right|_{p^{(n)}}, \qquad (7.14)$$

the derivatives of the likelihood following from the chain rule for derivatives,
as well as the Fisher matrix for the parameters

$$F_{kk'} = \sum_{\ell\ell'} \frac{\partial C_\ell}{\partial p_k}\, F_{\ell\ell'}\, \frac{\partial C_{\ell'}}{\partial p_{k'}}. \qquad (7.15)$$

In practice though, we will anyhow compute the power spectrum $C_\ell$ and the
inverse of its error covariance matrix, $F_{\ell\ell'}$, since this is the most
compressed form of the experiment with no theoretical prior; anyone with a new
theoretical model can use it to constrain their favourite model parameters even
long after the power spectrum has been released (see below the discussion
on hidden priors). We thus turn to the model fitting of the estimated power
spectrum, $\hat C_\ell$.
The principle is again very simple. One lays out a grid of models and
computes for each its likelihood given the data

$$L(\hat C_\ell | p) \propto \frac{\exp\left[-\frac{1}{2} \left(C_\ell(p) - \hat C_\ell\right)^T F_{\ell\ell'} \left(C_{\ell'}(p) - \hat C_{\ell'}\right)\right]}{\sqrt{\det F^{-1}}}, \qquad (7.16)$$

where a Gaussian approximation to the likelihood near its maximum has been
assumed. Despite the huge compression achieved so far (from TOD to
$\hat C_\ell$), this is still a non-trivial numerical task. Indeed in the CMB
context, we need minimally 11 parameters for high precision experiments, i.e. a
grid in 11 dimensions. Even with only 10 models for each parameter variation,
that still leaves $10^{11}$ models to estimate (and certainly many more, since
all will want to “zoom in” around the most likely). Note that the model
must also account for effects with a Planckian signature (e.g. lensing, or
reionisation, or anything else which may alter the primordial anisotropy
spectrum).

On the other hand, as noted by Tegmark [153] and Oh et al. [118],
this is probably unnecessary. The information used by the maximum likelihood
method is two-fold: the parameter dependence of the power spectrum,
$C_\ell(p)$, and the covariance matrix of the error or the Fisher matrix, which
also depends on the parameters. But if the constraints on the spectrum are
tight, the Fisher matrix is well determined (see Fig. 47a) and barely depends
at all on the parameters fitted for; it may thus be taken constant. One is
then left with a much simpler fit to the power spectrum

$$\chi^2(p) = \left(C_\ell(p) - \hat C_\ell\right)^T F_{\ell\ell'} \left(C_{\ell'}(p) - \hat C_{\ell'}\right). \qquad (7.17)$$

The soundness of the approximation ($F \sim$ constant) may be verified a
posteriori by comparing the derived best fit with the smooth fit (e.g. via
splines) to the spectrum (see footnote) which was used to estimate the Fisher
matrix in the first place. A specific example is shown in Figure 47b.
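The constant-Fisher fit of equation (7.17) can be sketched on a grid as follows. Everything specific here is an assumption made for illustration: a hypothetical two-parameter power-law spectrum stands in for a real cosmological model, the "estimated" spectrum is noiseless, and the fixed Fisher matrix is the diagonal cosmic-variance one, $F_{\ell\ell} = (2\ell+1)/(2 C_\ell^2)$.

```python
import numpy as np

def chi2(c_model, c_hat, fisher):
    """chi^2(p) = (C(p) - C_hat)^T F (C(p) - C_hat), with F held fixed."""
    d = c_model - c_hat
    return d @ fisher @ d

def toy_model(ell, amp, tilt):
    """Hypothetical two-parameter spectrum used only for this example."""
    return amp * (ell / 100.0) ** tilt

ell = np.arange(2, 200)
c_hat = toy_model(ell, 2.0, -1.0)                    # stands in for the estimate
fisher = np.diag((2 * ell + 1) / (2.0 * c_hat**2))   # cosmic variance only

# Lay out a grid of models and evaluate chi^2 for each of them.
amps = np.linspace(1.0, 3.0, 41)
tilts = np.linspace(-1.5, -0.5, 41)
grid = np.array([[chi2(toy_model(ell, a, t), c_hat, fisher) for t in tilts]
                 for a in amps])

i, j = np.unravel_index(np.argmin(grid), grid.shape)
best_amp, best_tilt = amps[i], tilts[j]
print(best_amp, best_tilt, grid[i, j])
```

With 11 parameters instead of 2, the same nested evaluation is what makes the grid so expensive; the gain of equation (7.17) is that each evaluation needs only the model spectrum, not a new Fisher matrix.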

It is important to remember that even without explicitly adding priors 
(like a bound on acceptable values of the Hubble constant from other astro- 
physical measurements, or a bound on baryonic abundance from primordial 
nucleosynthesis), there are always the “hidden” priors in this parameter fit- 
ting. Indeed the grid of models used will unavoidably correspond to some 
particular class of theories, e.g. adiabatic Gaussian fluctuations generated 
during a generic inflationary phase, as long as we lack an established grand 
unified theory of particle physics to fix the framework... In the case above, 
one hidden prior is that there are no isocurvature modes. Allowing a non- 
zero fraction of isocurvature modes, for instance generated by defects, will 
lead to different sets of constraints [29,38]. 



Footnote: This involves the prior that the derived spectrum should be smooth,
as expected in all current theories.
Finally, even if the CMB anisotropies are Gaussian in the first place,
this will require a demonstration! In practice, one shall apply a battery 
of tests to constrain how likely the results are to be derived from Gaussian 
statistics in each case. Of course, this is assuming that all extraneous sources 
of non-Gaussianity (systematics, residuals from the components separation, 
weak signals from Thomson scattering, etc.) have been removed from the 
analysed data. And maybe the primary anisotropies will turn out to have 
some non-Gaussian features after all! Here again this is an area where much 
progress is anticipated/needed in the coming years. 

8 Conclusions 

The field of CMB observations is fast changing in many respects. Second 
generation experiments begin to give spectacular results and will keep doing 
so in the coming years. These experiments measure the CMB at the best 
frequencies where foregrounds are minimal. In fact, their sensitivity will 
not require more than a rough subtraction of the dominant foreground. 

High accuracy measurements of the intensity and of the polarisation will 
require the observation and removal of all the foregrounds. This, in turn, 
will be possible only if the physics of these emissions is understood well 
enough. For extragalactic point sources and galactic dust emission many 
questions are still open and will demand work before the type of foreground 
subtraction needed for Planck can be done. We can quote the redshift 
evolution of infrared galaxies or the galactic dust polarisation as examples 
of such questions. 

The correlation between SZ observations systematically identifying clusters
of galaxies and lensing effects is another emerging topic which, with
many others (see footnote), was not touched upon in these lecture notes.

The present generation of experiments is confronted with the questions
of removal of systematic effects and of optimal data analysis with data sets
already much larger than the COBE one. Rapid progress is being made in
the context of these experiments.

In conclusion, in these lecture notes, the authors have tried to capture a
picture of a quickly moving target and to show some of the directions of future
work. Many others will undoubtedly emerge in the near future.
Appendix



A Formulating the component separation problem 

In order to make these notes as self-contained as possible, we reproduce 
here the formulation of the component separation problem derived step by 
step by Bouchet and Gispert in [27]. The next appendix can also be read 
in sequence. 

A.1 Physical model

We denote by $\mathcal{F}(\nu, e)$ the flux at a frequency $\nu$ in the
direction on the sphere referenced by the unit vector $e$. We make the
hypothesis that $\mathcal{F}(\nu, e)$ is a linear superposition of the
contributions from $N_c$ components, each of which can be factorised as a
spatial template $T_p(e)$ at a reference frequency (e.g. at 100 GHz), times a
spectral coefficient $g_p(\nu)$:

$$\mathcal{F}(\nu, e) = \sum_{p=1}^{N_c} g_p(\nu)\, T_p(e). \qquad (A.1)$$

This factorisation is not stricto sensu applicable to the background due 
to unresolved point sources, since at different frequencies different redshift 
ranges dominate, and thus different sources. But even at the Planck-HFI 
resolution, this is, as we showed earlier, a weak effect which can be safely 
ignored for our present purposes. 

Let us also note that the factorisation assumption above does not 
restrict the analysis to components with a spatially constant spectral 
behaviour. Indeed, these variations are expected to be small, and can thus 
be linearised too. For instance, in the case of a varying spectral index 
whose contribution can be modelled as $\nu^{\alpha(\mathbf{e})} T_p(\mathbf{e})$, we would decompose it 
as $\nu^{\bar\alpha}\,[1 + (\alpha(\mathbf{e}) - \bar\alpha) \ln \nu]\, T_p(\mathbf{e})$. We would thus have two spatial templates to 
recover, $T_p(\mathbf{e})$ and $(\alpha(\mathbf{e}) - \bar\alpha)\, T_p(\mathbf{e})$, with different spectral behaviours, $\propto \nu^{\bar\alpha}$ 
and $\propto \nu^{\bar\alpha} \ln \nu$ respectively. But given the low expected level of the high 
latitude synchrotron emission, this is unlikely to be necessary. This simple 
trick may still be of some use though to describe complex dust properties in 
regions with larger optical depth than assumed here (see also the alternative 
way proposed by Tegmark [154]). 
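
The quality of this linearisation can be gauged numerically. The sketch below is only an illustration: the mean index of -0.9, the index excursions of 0.05 and the frequency range (in units of the reference frequency) are assumed values, not numbers taken from these notes.

```python
import math

def exact_spectrum(nu, alpha):
    """Power law nu**alpha, with nu in units of the reference frequency."""
    return nu ** alpha

def linearised_spectrum(nu, alpha, alpha_mean):
    """First-order expansion of nu**alpha around the mean index alpha_mean."""
    return nu ** alpha_mean * (1.0 + (alpha - alpha_mean) * math.log(nu))

ALPHA_MEAN = -0.9  # assumed mean spectral index (illustrative only)

# Relative error of the linearised form for small index variations.
relative_errors = []
for alpha in (-0.95, -0.90, -0.85):
    for nu in (0.3, 1.0, 3.0):
        exact = exact_spectrum(nu, alpha)
        approx = linearised_spectrum(nu, alpha, ALPHA_MEAN)
        relative_errors.append(abs(approx - exact) / exact)

print("largest relative error:", max(relative_errors))
```

The error is of second order in $(\alpha - \bar\alpha) \ln \nu$, so it stays at the per-mille level for the assumed excursions and grows as the frequency lever arm increases.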

A.2 The separation problem 

During an experiment, the microwave sky is scanned by arrays of detectors 
sensitive to various frequencies and with various optical responses and noise 
properties. Their response is transformed in many ways by the ensuing 
electronics chains, by interactions between components of the experiments, 
the transmission to the ground, etc. The resulting sets of time-ordered 
data (TOD) then need to be heavily massaged to produce well-calibrated 
pixelised maps at a number of frequencies, each with an effective resolution, 
and well specified noise properties. We assume here that all this work has 
already been done (for further details, see [44,46,93,153,169]). 

We thus focus on the next step, i.e. the joint analysis of noisy 
pixelised sky maps, or signal maps $s_i$, at $N_\nu$ different frequencies in order to 
extract information on the different underlying physical components. It is 
convenient to represent a pixelised map as a vector containing the $N_p$ map 
pixel values, the $j$-th pixel corresponding to the direction on the sky $\mathbf{e}_j$. 
Let us denote by $t_p$ the unknown pixelised maps of the physical templates 
$T_p$. We thus have to find the best estimates of the templates $t_p(\mathbf{e}_j)$, which 
are related to the observed signal by the $N_\nu \times N_p$ equations 

$$ s_i(\mathbf{e}_j) = \sum_{p=1}^{N_c} \int_0^\infty d\nu\; v_i(\nu - \nu_i)\, g_p(\nu) \sum_{k=1}^{N_p} w_i(\mathbf{e}_j - \mathbf{e}_k)\, t_p(\mathbf{e}_k) + n_i(\mathbf{e}_j) \tag{A.2} $$

$$ \phantom{s_i(\mathbf{e}_j)} = \sum_{p=1}^{N_c} v_i \star g_p\; [w_i \star t_p](\mathbf{e}_j) + n_i(\mathbf{e}_j), \tag{A.3} $$

where $\star$ stands for the convolution operation. This system is to be solved 
assuming we know for each signal map its effective spectral and optical 
transmission, $v_i(\nu - \nu_i)$ and $w_i(\mathbf{e})$, and that the covariance matrix $\mathbf{N}_i = \langle \mathbf{n}_i \mathbf{n}_i^T \rangle$ 
of the (unknown) noise component has been properly estimated 
during the first round of analysis. 

Since the searched-for templates $t_p$ are convolved with the beam 
responses, $w_i$, it is in fact more convenient to formulate the problem in terms 
of spherical harmonic transforms (denoted by an over-brace), in which 
case the convolutions reduce to products. For each mode $\boldsymbol{\ell} \equiv \{\ell, m\}$, we 
can arrange the data concerning the $N_\nu$ channels as a complex vector of 
observations, $\mathbf{y}(\boldsymbol{\ell}) \equiv \{\overbrace{s_1}(\boldsymbol{\ell}), \overbrace{s_2}(\boldsymbol{\ell}), \ldots, \overbrace{s_{N_\nu}}(\boldsymbol{\ell})\}$, and we define similarly 
a complex noise vector, $\mathbf{b}(\boldsymbol{\ell}) \equiv \{\overbrace{n_1}(\boldsymbol{\ell}), \overbrace{n_2}(\boldsymbol{\ell}), \ldots, \overbrace{n_{N_\nu}}(\boldsymbol{\ell})\}$, while the 
corresponding template data will be arranged in a complex vector $\mathbf{x}(\boldsymbol{\ell})$ of 
length $N_c$. We thus have to solve for $\mathbf{x}$, for each mode $\boldsymbol{\ell}$ independently, the 
matrix equation 



$$ \mathbf{y}(\boldsymbol{\ell}) = \mathbf{A}(\boldsymbol{\ell})\, \mathbf{x}(\boldsymbol{\ell}) + \mathbf{b}(\boldsymbol{\ell}), \tag{A.4} $$

with $A_{ip}(\boldsymbol{\ell}) = [v_i \star g_p(\nu)]\; w_i(\boldsymbol{\ell})$. Note, though, that the independence of the $\boldsymbol{\ell}$ 
modes would be lost in case one uses functions other than spherical 
harmonics (e.g. to ensure orthogonality of the basis functions when all the sky 
is not available for the analysis [66]). Then instead of solving $N_p$ systems 
of $N_\nu$ equations, one will have to tackle a much larger system of $N_p \times N_\nu$ 
equations (possibly in real space rather than in the transformed space). For 
simplicity we treat each mode separately in the following. 
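
As an illustration of equation (A.4), the following sketch assembles a small mixing matrix and forms a noiseless data vector for one mode. Everything numerical in it is an assumption made for the example: the three channels, their Gaussian beams, the delta-function bandpasses (so that $v_i \star g_p$ reduces to $g_p(\nu_i)$) and the two toy spectra are hypothetical and do not describe any real instrument.

```python
import math

def gaussian_beam(ell, fwhm_arcmin):
    """Harmonic transform of a Gaussian beam (an assumed beam shape)."""
    sigma = math.radians(fwhm_arcmin / 60.0) / math.sqrt(8.0 * math.log(2.0))
    return math.exp(-0.5 * ell * (ell + 1) * sigma ** 2)

def mixing_matrix(ell, channels, spectra):
    """A_ip(l) = g_p(nu_i) * w_i(l), valid for delta-function bandpasses."""
    return [[g(nu) * gaussian_beam(ell, fwhm) for g in spectra]
            for nu, fwhm in channels]

# Hypothetical channels: (frequency in GHz, beam FWHM in arcmin).
channels = [(100.0, 10.0), (143.0, 7.0), (217.0, 5.0)]
# Hypothetical spectra, normalised at the 100 GHz reference frequency.
spectra = [lambda nu: 1.0,                    # flat component
           lambda nu: (nu / 100.0) ** 2]      # rising, dust-like component

ell = 500
A = mixing_matrix(ell, channels, spectra)     # N_nu x N_c matrix
x = [1.0, 0.2]                                # template amplitudes for this mode
y = [sum(a_ip * x_p for a_ip, x_p in zip(row, x)) for row in A]  # y = A x, b = 0

for (nu, _), y_i in zip(channels, y):
    print(f"{nu:6.1f} GHz: y = {y_i:.4f}")
```

Each row of `A` describes how one channel sees the components, so the separation problem amounts to inverting this small system mode by mode once noise is included.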

B Error forecasts assuming Wiener filtering 

In this Appendix we again follow the presentation of Bouchet and 
Gispert [27] to show that many properties of the end result of Wiener 
filtering of maps observed at different frequencies can be simply computed 
by using what they call the Quality factor of the extraction of a given 
astrophysical component, given an experimental set-up, assuming Wiener 
filtering is used to analyse a particular sky model. 

B.1 Reconstruction errors of linear component separations 

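Any linear recovery procedure may be written as an $N_c \times N_\nu$ matrix $\mathbf{W}(\boldsymbol{\ell})$ 
which, applied to the $\mathbf{y}(\boldsymbol{\ell})$ vector, yields an estimate $\hat{\mathbf{x}}(\boldsymbol{\ell})$ of the $\mathbf{x}(\boldsymbol{\ell})$ vector, 
i.e. 

$$ \hat{\mathbf{x}}(\boldsymbol{\ell}) = \mathbf{W}(\boldsymbol{\ell})\, \mathbf{y}(\boldsymbol{\ell}). \tag{B.1} $$

In the following, we show that the matrix $\mathbf{R} = \mathbf{W}\mathbf{A}$ plays an important 
role in determining the properties of the "theory" estimated by the selected 
method, whatever it might be. Indeed, since $\hat{\mathbf{x}} = \mathbf{W}[\mathbf{A}\mathbf{x} + \mathbf{b}]$, and the noise 
is uncorrelated with the templates, we have 

$$ \langle \hat{\mathbf{x}} \rangle = \mathbf{R} \langle \mathbf{x} \rangle \quad \text{and} \quad \langle \hat{\mathbf{x}} \hat{\mathbf{x}}^\dagger \rangle = \mathbf{R} \langle \mathbf{x} \mathbf{x}^\dagger \rangle \mathbf{R}^\dagger + \mathbf{W} \mathbf{B} \mathbf{W}^\dagger, \tag{B.2} $$

where $\mathbf{x}^\dagger$ is the Hermitian conjugate of $\mathbf{x}$ (i.e. $\mathbf{x}^\dagger$ is the complex 
conjugate of the transpose, $\mathbf{x}^T$, of the vector $\mathbf{x}$), and $\mathbf{B} = \langle \mathbf{b} \mathbf{b}^\dagger \rangle$ stands for the 
(supposedly known) noise covariance matrix. We denote by $\langle Y \rangle$ the average 
over an ensemble of statistical realisations of the quantity $Y$. The covariance 
matrix of the reconstruction errors, $\mathbf{E}$, is given by 

$$ \mathbf{E} \equiv \langle (\mathbf{x} - \hat{\mathbf{x}})(\mathbf{x} - \hat{\mathbf{x}})^\dagger \rangle = \mathbf{C} - \mathbf{R}\mathbf{C} - \mathbf{C}\mathbf{R}^\dagger + \langle \hat{\mathbf{x}} \hat{\mathbf{x}}^\dagger \rangle, \tag{B.3} $$

where $\mathbf{C} = \langle \mathbf{x} \mathbf{x}^\dagger \rangle$ stands for the covariance matrix of the underlying 
templates. 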
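This covariance matrix $\mathbf{C}$ should of course be estimated 
from the data itself. Since the elements of the covariance matrix of the 
recovered templates (assuming they have zero mean) may be written as 

$$ \langle \hat{x}_p \hat{x}_q^* \rangle = R_{pp} R_{qq} \langle x_p x_q^* \rangle + \sum_{r \neq p} \sum_{s \neq q} R_{pr} R_{qs} \langle x_r x_s^* \rangle + W_{p\nu} W_{q\nu'} B_{\nu\nu'}, \tag{B.4} $$

we define the intermediate variable $a_p \equiv \hat{x}_p / R_{pp}$ in order to write 

$$ \langle a_p a_q^* \rangle = \langle x_p x_q^* \rangle + \mathcal{B}_{pq}, \tag{B.5} $$

where $\mathcal{B}$ is then an additive bias (written $\mathcal{B}$ here to distinguish it from 
the noise covariance $\mathbf{B}$). This shows that an unbiased estimate of 
the covariance matrix of the unknown templates, $\mathbf{C}$, may be obtained by 

$$ \hat{C}_{pq} = \langle a_p a_q^* \rangle - \mathcal{B}_{pq} = \frac{1}{2\ell + 1} \sum_{m=-\ell}^{\ell} a_p(\ell, m)\, a_q^*(\ell, m) - \mathcal{B}_{pq}. \tag{B.6} $$

This expression is only formal though, since $\mathcal{B}$ contains some of the 
unknowns. Still, this form allows one to generalise the calculation of [100] of the 
expected error (covariance) of the power spectra estimates derived in such 
a way from the maps. Indeed, 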



[Equations (B.7) and (B.8), which expand the covariance Cov of the estimated spectra, are illegible in the source.] 



Once this expression is developed, there are many terms with fourth 
moments of the form $\langle x_r(\ell, m)\, x_s(\ell, m)^*\, x_t(\ell, m')\, x_u(\ell, m')^* \rangle$. 

If we assume that all the $x_p$ are Gaussian variables, then we can express 
these fourth moments as products of second moments, and recontract the 
sums to finally obtain 



[Equations (B.9) to (B.12), giving the resulting covariance of the power spectrum estimates, are illegible in the source.] 



This expression shows the extra contribution of noise and foregrounds to 
the cosmic variance. It can be re-expressed and simplified once a specific 
inversion method has been specified. 
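B.2 Specific case of Wiener filtering 

The definition (6.15) of Wiener filtering implies that 

$$ \langle \hat{\mathbf{x}} \mathbf{x}^\dagger \rangle = \langle \mathbf{x} \hat{\mathbf{x}}^\dagger \rangle = \langle \hat{\mathbf{x}} \hat{\mathbf{x}}^\dagger \rangle. \tag{B.13} $$

As a result, the expression for the covariance matrix of the reconstruction 
errors $\mathbf{E} = \langle \boldsymbol{\varepsilon} \boldsymbol{\varepsilon}^\dagger \rangle$ reduces to the simple form 

$$ \mathbf{E} = [\mathbf{I} - \mathbf{R}]\, \mathbf{C}, \tag{B.14} $$

where $\mathbf{I}$ stands for the identity matrix, and $\mathbf{R} = \mathbf{W}\mathbf{A}$ as before. 
Alternatively, $\mathbf{E} = \mathbf{C}\, [\mathbf{I} - \mathbf{R}^\dagger]$. 

The general expression of the covariance matrix of the templates given 
by equation (B.2) may be simplified by using the specific Wiener matrix 
properties of equations (B.13) and (B.14), which yields 

$$ \langle \hat{\mathbf{x}} \hat{\mathbf{x}}^\dagger \rangle = \mathbf{R} \langle \mathbf{x} \mathbf{x}^\dagger \rangle. \tag{B.15} $$

Both equations (B.14) and (B.15) clearly show that the closer $\mathbf{R}$ is to the 
identity, the smaller are the reconstruction errors. 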
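
These properties can be checked numerically on a toy case. The sketch below assumes the standard Wiener matrix $\mathbf{W} = \mathbf{C}\mathbf{A}^\dagger [\mathbf{A}\mathbf{C}\mathbf{A}^\dagger + \mathbf{B}]^{-1}$, written in the algebraically equivalent form $\mathbf{W} = [\mathbf{C}^{-1} + \mathbf{A}^\dagger \mathbf{B}^{-1} \mathbf{A}]^{-1} \mathbf{A}^\dagger \mathbf{B}^{-1}$ so that only a 2x2 inverse is needed; definition (6.15) is not reproduced here, so this form is an assumption. The mixing matrix, template covariance and noise levels are hypothetical real numbers chosen for the example.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def combine(X, Y, sign=1.0):
    """Element-wise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical set-up: 3 channels, 2 uncorrelated components, white noise.
A = [[1.0, 0.5], [1.0, 1.0], [1.0, 2.5]]        # mixing matrix A_ip
C = [[4.0, 0.0], [0.0, 1.0]]                    # template covariance
noise_var = [0.3, 0.2, 0.5]                     # diagonal of B
B = [[noise_var[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
B_inv = [[1.0 / noise_var[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
I2 = [[1.0, 0.0], [0.0, 1.0]]

# Assumed Wiener matrix: W = (C^-1 + A^T B^-1 A)^-1 A^T B^-1.
At_Binv = matmul(transpose(A), B_inv)
W = matmul(inverse_2x2(combine(inverse_2x2(C), matmul(At_Binv, A))), At_Binv)
R = matmul(W, A)
RC = matmul(R, C)

# Covariance of the estimate, equation (B.2).
cov_estimate = combine(matmul(RC, transpose(R)), matmul(matmul(W, B), transpose(W)))
# General error covariance, equation (B.3); C R^T = (R C)^T since C is symmetric.
E_general = combine(combine(combine(C, RC, -1.0), transpose(RC), -1.0), cov_estimate)
# Wiener-specific form, equation (B.14).
E_wiener = matmul(combine(I2, R, -1.0), C)

print("R diagonal:", [R[p][p] for p in range(2)])
print("E (B.3):   ", E_general)
print("E (B.14):  ", E_wiener)
```

For this choice of $\mathbf{W}$ the two expressions for $\mathbf{E}$ agree to rounding error, and `cov_estimate` coincides with `RC`, as stated by equation (B.15).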

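The diagonal elements of $\mathbf{E}$ are of particular interest since they give the 
residual errors on each recovered template, 

$$ \varepsilon_p^2 \equiv E_{pp} = (1 - R_{pp})\, C_{pp}. \tag{B.16} $$

To break down this overall error into its various contributions, we start from 
(repeated indices being summed) 

$$ \varepsilon_p^2(\boldsymbol{\ell}) = \langle | x_p - W_{p\nu} (A_{\nu p'} x_{p'} + b_\nu) |^2 \rangle \tag{B.17} $$

$$ \phantom{\varepsilon_p^2(\boldsymbol{\ell})} = (W_{p\nu} A_{\nu p'} - \delta_{pp'}) (W_{p\nu'} A_{\nu' p''} - \delta_{pp''}) \langle x_{p'} x_{p''}^* \rangle + W_{p\nu} W_{p\nu'} \langle b_\nu b_{\nu'}^* \rangle, \tag{B.18} $$

which may be written as 

$$ \varepsilon_p^2 = (1 - R_{pp})^2 C_{pp} + \sum_{p' \neq p \neq p''} R_{pp'} R_{pp''} C_{p'p''} + W_{p\nu} W_{p\nu'} B_{\nu\nu'}. \tag{B.19} $$

By using equation (B.16), we get 

$$ \varepsilon_p^2 = \frac{1}{R_{pp}} \Big[ \sum_{p' \neq p \neq p''} R_{pp'} R_{pp''} C_{p'p''} + W_{p\nu} W_{p\nu'} B_{\nu\nu'} \Big], \tag{B.20} $$

which describes the power leakage from the other processes and the co-added 
noises. 

In addition, the error on the deduced power spectra given by 
equation (B.12) takes for Wiener filtering a simple form, equation (B.21), 
which will prove particularly convenient to compare experiments. 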

In this theoretical analysis, we can always assume for simplicity that 
we have decomposed the sky flux into a superposition of emissions from 
uncorrelated templates, so that 

$$ \langle x_p x_{p'}^* \rangle = \delta_{pp'}\, C_{pp}(\ell) = \delta_{pp'}\, C_p(\ell). \tag{B.22} $$

In the uncorrelated case, the previous expressions (B.15, B.16, and B.20, 
B.21) may be written as 

$$ \langle |\hat{x}_p|^2 \rangle = Q_p \langle |x_p|^2 \rangle = Q_p\, C_p(\ell), \tag{B.23} $$

$$ \varepsilon_p^2 = (1 - Q_p)\, C_p = \frac{1}{Q_p} \Big[ \sum_{p' \neq p} R_{pp'}^2 C_{p'} + W_{p\nu} W_{p\nu'} B_{\nu\nu'} \Big], \tag{B.24} $$

[Equation (B.25), the uncorrelated form of (B.21), is missing from the source.] 

where $Q_p = R_{pp}$ stands for the diagonal elements of $\mathbf{W}\mathbf{A}$. Thus $Q_p$ tells us 

1. how the typical amplitude of the Wiener-estimated modes $\hat{x}$ is 
damped as compared to the real ones (Eq. (B.23)); 

2. the spectrum of the residual reconstruction error in every map 
(Eq. (B.24)); note that this error may be further broken down in 
residuals from each component in every map by using the full 
$\mathbf{R}$ matrix; 

3. the uncertainty added by the noise and the foreground removal 
($\propto 1/Q_p - 1$, Eq. (B.25)) to cosmic (or sampling) variance which 
is given by $C_p \sqrt{2/(2\ell + 1)}$ (this result only holds though under the 
simplifying assumption of Gaussianity of all the sky components). 

Given these interesting properties, Bouchet and Gispert suggested using $Q_p$ 
as a "quality factor" to assess the ability of experimental set-ups to recover a 
given process $p$ in the presence of other components; it assesses in particular 
how well the CMB itself can be disentangled from the foregrounds. 

This "quality" indicator generalises the real space "Foreground 
Degradation Factor" introduced by [49]. It may be viewed as an 
extension of the usual window functions used to describe an experimental setup. 
This can be seen most easily by considering a noiseless thought experiment 
mapping directly the CMB anisotropies with a symmetrical beam profile 
$w(\theta)$. Then the power spectrum of the map will be the real power 
spectrum times the square of the spherical harmonic transform of the beam, 
$\langle |\hat{x}_p(\ell)|^2 \rangle = w(\ell)^2 \langle |x_p(\ell)|^2 \rangle$. The spherical transform of $Q_p^{1/2}(\ell)$ is then the 
beam profile of a thought experiment directly measuring CMB anisotropies. 
The shape of $Q_p(\ell)$ thus gives us a direct insight on the real angular 
resolution of an experiment when foregrounds are taken into account. 

In addition, one often considers (e.g. for theoretical studies of the 
accuracy of parameter estimation from power spectra) a somewhat less 
idealised experiment which still maps directly the CMB, with a beam profile 
$w(\theta)$, but including also detector noise, characterised by its power spectrum 
$C_n$. Let us suppose this is analysed by Wiener filtering. We have to solve 
$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}$, with 

$$ \mathbf{A}^T = \{w, w, w, w, \ldots\}, \quad \langle \mathbf{x} \mathbf{x}^\dagger \rangle = C_x, \quad \mathbf{B} = C_n \times \mathbf{I}. \tag{B.26} $$

The previous formulae then lead to 

$$ Q_x(\ell) = \frac{C_x(\ell)}{C_x(\ell) + w^{-2}(\ell)\, C_n(\ell) / N_c}, \tag{B.27} $$

$$ \Delta C_x(\ell) = \sqrt{\frac{2}{2\ell + 1}}\, \frac{C_x(\ell)}{Q_x(\ell)} = \sqrt{\frac{2}{2\ell + 1}} \left( C_x(\ell) + w(\ell)^{-2}\, C_n(\ell) / N_c \right), \tag{B.28} $$

where $N_c$ stands as before for the number of channels (measurements). 
Equation (B.27) tells us that the Wiener-estimated modes (as seen by the 
ratio of the estimated to real spectra) are damped by the ratio of the 
expected signal to signal + noise (the noise power spectrum being "on the 
sky", cf. Section 4.3.2, with the noise power spectrum being the noise 
power spectrum per channel divided by the number of channels). In brief, 
what Wiener filtering does when the noise strength is getting larger than 
the signal is to progressively set the estimate of the signal to zero in an 
attempt to return only true features. And of course we recover the 
traditional error on the estimated power spectrum added by noise found by [100]. 
Thus determining the quality factor allows one to find the equivalent ideal 
experiment often considered by theorists; it provides a direct estimate of the 
final errors on the power spectrum determination from a foreground model 
and the summary table of performance of an experiment. 
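
The single-component case lends itself to a short numerical sketch. The two functions below implement the signal to signal + noise ratio and the associated power spectrum error described in the text; the function names and all input values (spectra, beam transform, number of channels) are hypothetical and chosen only to exercise the formulae.

```python
import math

def quality_factor(c_x, c_n, w, n_channels):
    """Signal / (signal + on-sky noise); the noise is divided by the
    number of channels and deconvolved from the beam transform w."""
    return c_x / (c_x + c_n / (n_channels * w ** 2))

def spectrum_error(ell, c_x, c_n, w, n_channels):
    """Gaussian error on the estimated spectrum: cosmic variance plus noise."""
    return math.sqrt(2.0 / (2 * ell + 1)) * (c_x + c_n / (n_channels * w ** 2))

# Hypothetical numbers in arbitrary units.
ell, c_x, c_n, n_channels = 200, 1.0, 4.0, 4
for w in (1.0, 0.5, 0.1):
    q = quality_factor(c_x, c_n, w, n_channels)
    dc = spectrum_error(ell, c_x, c_n, w, n_channels)
    print(f"w = {w:3.1f}: Q = {q:.4f}, Delta C / C = {dc / c_x:.4f}")
```

With these inputs the quality factor drops from 0.5 where the beam is transparent to about 0.01 where $w = 0.1$, showing how the beam cut-off and the noise together set the effective resolution.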

Since this is rather convenient, a generalisation of this approach to the 
case of polarisation measurements was developed by [30] and used on a prior 
estimate of the level of dust polarisation developed by [127]. 
INTRODUCTION TO SUPERSYMMETRY: 
ASTROPHYSICAL AND PHENOMENOLOGICAL 
CONSTRAINTS 



K.A. Olive 



Abstract 

These lectures contain an introduction to supersymmetric theories 
and the minimal supersymmetric standard model. Phenomenological 
and cosmological consequences of supersymmetry are also discussed. 

1 Introduction 

It is often called the last great symmetry of nature. Rarely has so much 
effort, both theoretical and experimental, been spent to understand and 
discover a symmetry of nature, which up to the present time lacks concrete 
evidence. Hopefully, in these lectures, where I will give a pedagogical de- 
scription of supersymmetric theories, it will become clear why there is so 
much excitement concerning supersymmetry’s role in nature. 

After some preliminary background on the standard electroweak model, 
and some motivation for supersymmetry, I will introduce the notion of su- 
persymmetric charges and the supersymmetric transformation laws. The 
second lecture will present the simplest supersymmetric model (the non- 
interacting massless Wess-Zumino model) and develop the properties of 
chiral superfields, auxiliary fields, the superpotential, gauge multiplets and 
interactions. The next two lectures focus on the minimal supersymmetric 
standard model (MSSM) and its constrained version which is motivated by 
supergravity. The last two lectures will look primarily at the cosmological 
and phenomenological consequences of supersymmetry. 

1.1 Some preliminaries 

Why Supersymmetry? If for no other reason, it would be nice to understand 
the origin of the fundamental difference between the two classes of particles 
distinguished by their spin, fermions and bosons. If such a symmetry exists, 

© EDP Sciences, Springer- Verlag 2000 
one might expect that it is represented by an operator which relates the two 
classes of particles. For example, 

$$ Q\,|\mathrm{Boson}\rangle = |\mathrm{Fermion}\rangle, \qquad Q\,|\mathrm{Fermion}\rangle = |\mathrm{Boson}\rangle. \tag{1.1} $$

As such, one could claim a raison d’etre for fundamental scalars in nature. 
Aside from the Higgs boson (which remains to be discovered), there are no 
fundamental scalars known to exist. A symmetry as potentially powerful 
as that in equation (1.1) is reason enough for its study. However, without 
a connection to experiment, supersymmetry would remain a mathematical 
curiosity and a subject of a very theoretical nature as indeed it stood from 
its initial description in the early 1970’s [1,2] until its incorporation into a 
realistic theory of physics at the electroweak scale. 

One of the first break-throughs came with the realization that super- 
symmetry could help resolve the difficult problem of mass hierarchies [3], 
namely the stability of the electroweak scale with respect to radiative cor- 
rections. With precision experiments at the electroweak scale, it has also 
become apparent that grand unification is not possible in the absence of 
supersymmetry [4]. These issues will be discussed in more detail below. 

Because one of our main goals is to discuss the MSSM, it will be useful 
to first describe some key features of the standard model, if for no other 
reason than to establish the notation used below. The standard model is 
described by the $SU(3)_c \times SU(2)_L \times U(1)_Y$ gauge group. For the most part, 
however, I will restrict the present discussion to the electroweak sector. The 
Lagrangian for the gauge sector of the theory can be written as 

$$ \mathcal{L}_G = -\frac{1}{4} G^i_{\mu\nu} G^{i\mu\nu} - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}, \tag{1.2} $$

where $G^i_{\mu\nu} \equiv \partial_\mu W^i_\nu - \partial_\nu W^i_\mu + g \epsilon^{ijk} W^j_\mu W^k_\nu$ is the field strength for the SU(2) 
gauge boson $W^i_\mu$, and $F_{\mu\nu} \equiv \partial_\mu B_\nu - \partial_\nu B_\mu$ is the field strength for the U(1) 
gauge boson $B_\mu$. The fermion kinetic terms are included in 

$$ \mathcal{L}_F = -\sum_f i \left[ \bar{f}_L \gamma^\mu D_\mu f_L + \bar{f}_R \gamma^\mu D_\mu f_R \right], \tag{1.3} $$

where the gauge covariant derivative is given by 

$$ D_\mu \equiv \partial_\mu - i g \frac{\sigma_i}{2} W^i_\mu - i g' \frac{Y}{2} B_\mu. \tag{1.4} $$

The $\sigma_i$ are the Pauli matrices (representations of SU(2)) and $Y$ is the 
hypercharge. $g$ and $g'$ are the $SU(2)_L$ and $U(1)_Y$ gauge couplings respectively. 

Fermion mass terms are generated through the coupling of the left- and 
right-handed fermions to a scalar doublet Higgs boson $\phi$, 

$$ \mathcal{L}_Y = -\sum_f \left[ G_f\, \phi\, \bar{f}_L f_R \right] + \mathrm{h.c.} \tag{1.5} $$
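The Lagrangian for the Higgs field is 

$$ \mathcal{L}_\phi = -|D_\mu \phi|^2 - V(\phi), \tag{1.6} $$

where the (unknown) Higgs potential is commonly written as 

$$ V(\phi) = -\mu^2 \phi^\dagger \phi + \lambda (\phi^\dagger \phi)^2. \tag{1.7} $$

The vacuum state corresponds to a Higgs expectation value$^1$ 

$$ \langle \phi \rangle = \langle \phi^* \rangle = \begin{pmatrix} 0 \\ v \end{pmatrix} \quad \text{with} \quad v^2 = \frac{\mu^2}{2\lambda}. \tag{1.8} $$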

The non-zero expectation value and hence the spontaneous breakdown of 
the electroweak gauge symmetry generates masses for the gauge bosons 
(through the Higgs kinetic term in (1.6)) and fermions (through (1.5)). In 
a mass eigenstate basis, the charged W-bosons ($W^\pm_\mu \equiv (W^1_\mu \pm i W^2_\mu)/\sqrt{2}$) 
receive masses 

$$ M_W = \frac{1}{\sqrt{2}}\, g v. \tag{1.9} $$

The neutral gauge bosons are defined by 

$$ Z_\mu = \frac{g W^3_\mu - g' B_\mu}{\sqrt{g^2 + g'^2}}, \qquad A_\mu = \frac{g' W^3_\mu + g B_\mu}{\sqrt{g^2 + g'^2}}, \tag{1.10} $$

with masses 

$$ M_Z = \frac{1}{\sqrt{2}} \sqrt{g^2 + g'^2}\; v = M_W / \cos\theta_W, \qquad M_\gamma = 0, \tag{1.11} $$

where the weak mixing angle is defined by 

$$ \sin\theta_W = g' / \sqrt{g^2 + g'^2}. \tag{1.12} $$

Fermion masses are 

$$ m_f = G_f v. \tag{1.13} $$

As one can see, there is a direct relationship between particle masses and 
the Higgs expectation value, $v$. Indeed, we know from (1.9) and (1.11) 
that $v \sim M_W \sim O(100)$ GeV. We can then pose the question, why is 
$M_W \ll M_P = 1.2 \times 10^{19}$ GeV or equivalently why is $G_F \gg G_N$? 
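
The relations between the gauge boson masses and the mixing angle can be illustrated with a few lines of arithmetic. The sketch assumes $M_W = g v / \sqrt{2}$ in the convention of these lectures (which follows from the $M_Z$ expression and $M_Z = M_W / \cos\theta_W$), and it uses round, assumed values for the couplings and for $v$; these numbers are illustrative and are not quoted in the text.

```python
import math

# Assumed illustrative inputs (not taken from the lectures).
g, g_prime = 0.65, 0.36      # SU(2) and U(1) gauge couplings
v = 174.0                    # Higgs expectation value in GeV
PLANCK_MASS = 1.2e19         # GeV, as quoted in the text

norm = math.sqrt(g ** 2 + g_prime ** 2)
m_w = g * v / math.sqrt(2.0)               # charged boson mass
m_z = norm * v / math.sqrt(2.0)            # neutral boson mass
sin_theta_w = g_prime / norm               # weak mixing angle
cos_theta_w = g / norm

print(f"M_W = {m_w:.1f} GeV, M_Z = {m_z:.1f} GeV")
print(f"sin^2(theta_W) = {sin_theta_w ** 2:.3f}")
print(f"M_Z - M_W / cos(theta_W) = {m_z - m_w / cos_theta_w:.2e} GeV")
print(f"M_W / M_P = {m_w / PLANCK_MASS:.1e}")
```

The last line makes the hierarchy explicit: with these inputs the weak scale lies some seventeen orders of magnitude below the Planck scale.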



$^1$ Note that the convention used here differs by a factor of $\sqrt{2}$ from that in much of the 
standard model literature. This is done so as to conform with the MSSM conventions 
used below. 

1.2 The hierarchy problem 

The mass hierarchy problem stems from the fact that masses, in 
particular scalar masses, are not stable to radiative corrections [3]. While fermion 
masses also receive radiative corrections from diagrams of the form in 
Figure 1, these are only logarithmically divergent (see for example [5]), 

$$ \delta m_f \simeq \frac{3\alpha}{4\pi}\, m_f \ln(\Lambda^2 / m_f^2). \tag{1.14} $$

$\Lambda$ is an ultraviolet cutoff, where we expect new physics to play an 
important role. As one can see, even for $\Lambda \sim M_P$, these corrections are small, 
$\delta m_f \lesssim m_f$. 




Fig. 1. 1-loop correction to a fermion mass. 

In contrast, scalar masses are quadratically divergent. 1-loop 
contributions to scalar masses, such as those shown in Figure 2, are readily computed 

$$ \delta m_H^2 \simeq g_f^2,\, g^2,\, \lambda \int d^4k\, \frac{1}{k^2} \sim O\!\left(\frac{\alpha}{4\pi}\right) \Lambda^2 \tag{1.15} $$

due to contributions from fermion loops with coupling $g_f$, from gauge boson 
loops with coupling $g$, and from quartic scalar couplings $\lambda$. From the 
relation (1.9) and the fact that the Higgs mass is related to the expectation 
value, $m_H^2 = 4 v^2 \lambda$, we expect $M_W \sim m_H$. However, if new physics enters 
in at the GUT or Planck scale so that $\Lambda \gg M_W$, the 1-loop corrections 
destroy the stability of the weak scale. That is, 

$$ \Lambda \gg M_W \;\rightarrow\; \delta m_H^2 \gg m_H^2. \tag{1.16} $$

Of course, one can tune the bare mass $m_H$ so that it contains a large negative 
term which almost exactly cancels the 1-loop correction, leaving a small 
electroweak scale mass$^2$. For a Planck scale correction, this cancellation 
must be accurate to 32 significant digits. Even so, the 2-loop corrections 
should be of order $\alpha^2 \Lambda^2$ so these too must be accurately canceled. Although 
such a series of cancellations is technically feasible, there is hardly a sense 
of satisfaction that the hierarchy problem is under control. 
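
The size of the required cancellation can be estimated with a rough calculation. The sketch below compares a correction of order $(\alpha/4\pi)\Lambda^2$, and the same correction without the loop factor, to a weak-scale mass squared. The Higgs mass of 100 GeV and the coupling $\alpha \approx 1/128$ are assumed round values, so the output only indicates the order of magnitude.

```python
import math

PLANCK_MASS = 1.2e19      # GeV, cutoff taken at the Planck scale
HIGGS_MASS = 100.0        # GeV, assumed weak-scale mass
ALPHA = 1.0 / 128.0       # assumed coupling strength at the weak scale

def tuning_digits(cutoff, loop_factor):
    """Decimal digits to which the bare mass squared must cancel the
    correction loop_factor * cutoff**2 to leave HIGGS_MASS**2."""
    return math.log10(loop_factor * cutoff ** 2 / HIGGS_MASS ** 2)

digits_with_loop_factor = tuning_digits(PLANCK_MASS, ALPHA / (4.0 * math.pi))
digits_without_loop_factor = tuning_digits(PLANCK_MASS, 1.0)

print(f"with alpha/4pi:   about {digits_with_loop_factor:.0f} digits")
print(f"with O(1) factor: about {digits_without_loop_factor:.0f} digits")
```

Depending on the loop factor assumed, the estimate falls between roughly 31 and 34 digits, which brackets the figure of 32 significant digits quoted above.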

An alternative and by far simpler solution to this problem exists if one 
postulates that there are new particles with similar masses and equal 
couplings to those responsible for the radiatively induced masses but with a 
difference (by a half unit) in spin. Then, because the contribution to $\delta m_H^2$ 
due to a fermion loop comes with a relative minus sign, the total 
contribution to the 1-loop corrected mass$^2$ is 

$$ \delta m_H^2 \simeq O\!\left(\frac{\alpha}{4\pi}\right) (\Lambda^2 + m_B^2) - O\!\left(\frac{\alpha}{4\pi}\right) (\Lambda^2 + m_F^2) = O\!\left(\frac{\alpha}{4\pi}\right) (m_B^2 - m_F^2). \tag{1.17} $$

Fig. 2. 1-loop corrections to a scalar mass. 

If in addition, the bosons and fermions all have the same masses, then the 
radiative corrections vanish identically. The stability of the hierarchy only 
requires that the weak scale is preserved so that we need only require that 

$$ |m_B^2 - m_F^2| \lesssim 1\ \mathrm{TeV}^2. \tag{1.18} $$

As we will see in the lectures that follow, supersymmetry offers just the 
framework for including the necessary new particles and the absence of 
these dangerous radiative corrections [6]. 
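
The cancellation of the cutoff dependence can be made explicit with exact arithmetic. In the sketch below the loop factor of 1/1000 stands in for $O(\alpha/4\pi)$, and the boson and fermion masses are hypothetical; exact rational numbers are used because the Planck-scale terms would swamp the mass terms in double precision.

```python
from fractions import Fraction

LOOP_FACTOR = Fraction(1, 1000)   # stand-in for the O(alpha / 4 pi) coefficient

def boson_loop(cutoff, m_boson):
    """Positive, quadratically divergent boson contribution (GeV^2)."""
    return LOOP_FACTOR * (cutoff ** 2 + m_boson ** 2)

def fermion_loop(cutoff, m_fermion):
    """Fermion contribution, entering with a relative minus sign (GeV^2)."""
    return -LOOP_FACTOR * (cutoff ** 2 + m_fermion ** 2)

def net_correction(cutoff, m_boson, m_fermion):
    return boson_loop(cutoff, m_boson) + fermion_loop(cutoff, m_fermion)

# Hypothetical partner masses in GeV, evaluated at two very different cutoffs.
m_boson, m_fermion = 600, 175
for cutoff in (10 ** 3, 10 ** 19):
    print(f"cutoff = 1e{len(str(cutoff)) - 1} GeV: "
          f"net correction = {float(net_correction(cutoff, m_boson, m_fermion))} GeV^2")
```

The net correction is the same at both cutoffs and depends only on the mass splitting, which is the content of the condition on $|m_B^2 - m_F^2|$ above.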

Before we embark, I would like to call attention to some excellent ad- 
ditional resources on supersymmetry. These are the classic by Bagger and 
Wess on supersymmetry [7], the book by Ross on Grand Unification [8] and 
two recent reviews by Martin [9] and Ellis [10]. 

1.3 Supersymmetric operators and transformations 

Prior to the introduction of supersymmetry, operators were generally 
regarded as bosonic. That is, they were either scalar, vector, or tensor 
operators. The momentum operator, $P_\mu$, is a common example of a vector 
operator. However, the types of bosonic charges are greatly limited, as was 
shown by Coleman and Mandula [11]. Given a tensorial operator, $\Sigma_{\mu\nu}$, its 
diagonal matrix elements can be decomposed as 

$$ \langle a | \Sigma_{\mu\nu} | a \rangle = \alpha\, p^a_\mu p^a_\nu + \beta\, g_{\mu\nu}. \tag{1.19} $$

One can easily see that unless $\alpha = 0$, 2 to 2 scattering processes allow only 
forward scattering. 

Operators of the form expressed in (1.1) however, are necessarily 
non-diagonal as they require a change between the initial and final state by 
at least a half unit of spin. Indeed, such operators, if they exist, must be 
fermionic in nature and carry a spinor index, $Q_\alpha$. There may in fact be 
several such operators, $Q^i$ with $i = 1, \ldots, N$ (though for the most part we 
will restrict our attention to $N = 1$ here). As a symmetry operator, $Q$ 
must commute with the Hamiltonian $H$, as must its anti-commutator. So 
we have 

$$ [Q^i_\alpha, H] = 0, \qquad [\{Q^i_\alpha, Q^j_\beta\}, H] = 0. \tag{1.20} $$

By extending the Coleman-Mandula theorem [12], one can show that 

$$ \{Q^i, Q^{j\dagger}\} \propto \delta^{ij} P_\mu + Z^{ij}, \tag{1.21} $$

where $Z^{ij}$ is antisymmetric in the supersymmetry indices $\{i, j\}$. Thus, this 
so-called "central charge" vanishes for $N = 1$. More precisely, we have in a 
Weyl basis 

$$ \{Q_\alpha, \bar{Q}_{\dot\beta}\} = 2 \sigma^\mu_{\alpha\dot\beta} P_\mu, \qquad \{Q_\alpha, Q_\beta\} = \{\bar{Q}_{\dot\alpha}, \bar{Q}_{\dot\beta}\} = 0, \qquad [Q_\alpha, P_\mu] = [\bar{Q}_{\dot\alpha}, P_\mu] = 0. \tag{1.22} $$

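Before setting up the formalism of the supersymmetric transformations, it 
will be useful to establish some notation for dealing with spinors in the 
Dirac and Weyl bases. The Lagrangian for a four-component Dirac fermion 
with mass $M$ can be written as 

$$ \mathcal{L}_D = -i \bar{\Psi}_D \gamma^\mu \partial_\mu \Psi_D - M \bar{\Psi}_D \Psi_D, \tag{1.23} $$

where 

$$ \gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar{\sigma}^\mu & 0 \end{pmatrix}, \qquad \gamma_5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{1.24} $$

and $\sigma_\mu = (1, \sigma_i)$, $\bar{\sigma}_\mu = (1, -\sigma_i)$, $\sigma_i$ are the ordinary 2x2 Pauli matrices. 
I am taking the Minkowski signature to be $(-, +, +, +)$. We can write the 
Dirac spinor $\Psi_D$ in terms of 2 two-component Weyl spinors 

$$ \Psi_D = \begin{pmatrix} \xi_\alpha \\ \bar{\chi}^{\dot\alpha} \end{pmatrix}, \qquad \bar{\Psi}_D = \begin{pmatrix} \chi^\alpha & \bar{\xi}_{\dot\alpha} \end{pmatrix}. \tag{1.25} $$

Note that the spinor indices $(\alpha, \dot\alpha)$ are raised and lowered by $\epsilon_{ij}$ where $\{ij\}$ 
can be either both dotted or both undotted indices. $\epsilon$ is totally 
antisymmetric and $\epsilon_{ij} = -\epsilon^{ij}$ with $\epsilon^{12} = 1$. It is also useful to define projection 
operators, $P_L$ and $P_R$, with 

$$ P_L \Psi_D = \frac{(1 - \gamma_5)}{2} \Psi_D = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \Psi_D = \begin{pmatrix} \xi_\alpha \\ 0 \end{pmatrix}, \tag{1.26} $$

with a similar expression for $P_R$. In this way we can interpret $\xi_\alpha$ as a 
left-handed Weyl spinor and $\bar{\chi}^{\dot\alpha}$ as a right-handed Weyl spinor. The Dirac 
Lagrangian (1.23) can now be written in terms of the two-component Weyl 
spinors as 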

>Cd = ~ (1-27) 

having used the identity, — = ^^a^x- 
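
The spinor conventions of this section can be checked numerically. The sketch below assumes the Weyl-basis block form of the $\gamma$ matrices, built from $\sigma^\mu = (1, \sigma_i)$ and $\bar\sigma^\mu = (1, -\sigma_i)$, together with $\gamma_5 = \mathrm{diag}(-1, -1, 1, 1)$; this explicit representation is an assumption made for the example. It verifies that, with the $(-, +, +, +)$ signature, these matrices satisfy $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$ and that $P_L = (1 - \gamma_5)/2$ projects onto the upper two components.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scaled(M, factor):
    return [[factor * m for m in row] for row in M]

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

sigma = [ONE2] + PAULI                               # assumed sigma^mu
sigma_bar = [ONE2] + [scaled(p, -1) for p in PAULI]  # assumed sigma-bar^mu
gamma = [block(ZERO2, s, sb, ZERO2) for s, sb in zip(sigma, sigma_bar)]
gamma5 = block(scaled(ONE2, -1), ZERO2, ZERO2, ONE2)
ETA = [-1, 1, 1, 1]                                  # signature (-, +, +, +)

def anticommutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] + YX[i][j] for j in range(4)] for i in range(4)]

# Largest deviation from {gamma^mu, gamma^nu} = -2 eta^{mu nu} times identity.
clifford_error = 0.0
for mu in range(4):
    for nu in range(4):
        ac = anticommutator(gamma[mu], gamma[nu])
        for i in range(4):
            for j in range(4):
                target = -2 * ETA[mu] if (mu == nu and i == j) else 0
                clifford_error = max(clifford_error, abs(ac[i][j] - target))

# Left-handed projector P_L = (1 - gamma_5) / 2.
P_L = [[((1 if i == j else 0) - gamma5[i][j]) / 2 for j in range(4)] for i in range(4)]
print("Clifford algebra deviation:", clifford_error)
print("P_L diagonal:", [P_L[i][i] for i in range(4)])
```

The projector comes out as $\mathrm{diag}(1, 1, 0, 0)$, so it keeps the two upper (left-handed) components of the four-component spinor and removes the lower ones.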

Instead, it is sometimes convenient to consider a four-component 
Majorana spinor. This can be done rather easily from the above conventions 
and taking ^ = y, so that 

^M = (r (1.28) 

and the Lagrangian can be written as 

1 

£m = — 

= + (1.29) 

The massless representations for supersymmetry are now easily constructed. 
Let us consider here $N = 1$ supersymmetry, i.e., a single supercharge 
$Q_\alpha$. For the massless case, we can choose the momentum to be of the 
form $P_\mu = \frac{1}{4}(-1, 0, 0, 1)$. As can be readily found from the 
anticommutation relations (1.22), the only non-vanishing anticommutation relation is 
$\{Q_1, \bar{Q}_{\dot 1}\} = 1$. Consider then a state of given spin, $|\lambda\rangle$, such that $\bar{Q}_{\dot 1} |\lambda\rangle = 0$. 
(If it is not 0, then due to the anticommutation relations, acting on it again 
with $\bar{Q}_{\dot 1}$ will vanish.) From the state $|\lambda\rangle$, it is possible to construct only 
one other nonvanishing state, namely $Q_1 |\lambda\rangle$; the action of any of the other 
components of $Q_\alpha$ will vanish as well. Thus, if the state $|\lambda\rangle$ is a scalar, then 
the state $Q_1 |\lambda\rangle$ will be a fermion of spin 1/2. This (super)multiplet will be 
called a chiral multiplet. If $|\lambda\rangle$ is spin 1/2, then $Q_1 |\lambda\rangle$ is a vector of spin 1, 
and we have a vector multiplet. In the absence of gravity (supergravity), 
these are the only two types of multiplets of interest. 
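
The counting argument above can be mimicked with 2x2 matrices. The sketch is a toy model, not a representation of the full algebra: it assumes a single pair of operators normalised so that their anticommutator is the identity, represents them as lowering and raising matrices on a two-state space, and labels the states by the spins quoted in the text.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def apply(M, state):
    return [sum(m * s for m, s in zip(row, state)) for row in M]

LOWER = [[0, 1], [0, 0]]   # annihilates the starting state
RAISE = [[0, 0], [1, 0]]   # builds the single additional state

def anticommutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] + YX[i][j] for j in range(2)] for i in range(2)]

base_state = [1, 0]                      # the state |lambda>
partner_state = apply(RAISE, base_state) # the state obtained by raising once

def multiplet(spin):
    """Spins of the two states of a massless N = 1 multiplet built on `spin`."""
    return [spin, spin + Fraction(1, 2)]

print("anticommutator:", anticommutator(LOWER, RAISE))
print("raising twice gives:", apply(RAISE, partner_state))
print("chiral multiplet spins:", [str(s) for s in multiplet(Fraction(0))])
print("vector multiplet spins:", [str(s) for s in multiplet(Fraction(1, 2))])
```

Because the raising operator squares to zero, each multiplet contains exactly two states, which is why only the chiral and vector multiplets arise in this construction.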

For $N > 1$, one can proceed in an analogous way. For example, with 
$N = 2$, we begin with two supercharges $Q^1, Q^2$. Starting with a state 
$|\lambda\rangle$, we can now obtain the following: $Q^1_1 |\lambda\rangle$, $Q^2_1 |\lambda\rangle$, $Q^1_1 Q^2_1 |\lambda\rangle$. In this case, 
starting with a complex scalar, one obtains two fermion states, and one 
vector, hence the vector (or gauge) multiplet. One could also start with 
a fermion state (say left-handed) and obtain two complex scalars, and a 
right-handed fermion. This matter multiplet however, is clearly not chiral 
and is not suitable for phenomenology. This problem persists for all 
supersymmetric theories with $N > 1$, hence the predominant interest in $N = 1$ 
supersymmetry. 
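Before we go too much further, it will be useful to make a brief connection 
with the standard model. We can write all of the standard model fermions 
in a two-component Weyl basis. The standard model fermions are therefore 

$$ Q_i = \begin{pmatrix} u \\ d \end{pmatrix}_L, \begin{pmatrix} c \\ s \end{pmatrix}_L, \begin{pmatrix} t \\ b \end{pmatrix}_L; \qquad u^c_i = u^c_L,\, c^c_L,\, t^c_L; \qquad d^c_i = d^c_L,\, s^c_L,\, b^c_L; $$

$$ L_i = \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L, \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}_L, \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix}_L; \qquad e^c_i = e^c_L,\, \mu^c_L,\, \tau^c_L. \tag{1.30} $$

Note that the fields above are all left-handed. Color indices have been 
suppressed. From (1.29), we see that we would write the fermion kinetic 
terms as 

$$ \mathcal{L}_{\rm kin} = -i Q_i^\dagger \sigma^\mu \partial_\mu Q_i - i u^{c\dagger}_i \sigma^\mu \partial_\mu u^c_i - \cdots \tag{1.31} $$

As indicated above and taking the electron as an example, we can form a 
Dirac spinor 

$$ \Psi_e = \begin{pmatrix} e_L \\ e_R \end{pmatrix}. \tag{1.32} $$

A typical Dirac mass term now becomes 

$$ \bar{\Psi}_e \Psi_e = e_L e^c_L + e_L^\dagger e^{c\dagger}_L = e_R^\dagger e_L + e_L^\dagger e_R. \tag{1.33} $$

As we introduce supersymmetry, the field content of the standard model 
will necessarily be extended. All of the standard model matter fields listed 
above become members of chiral multiplets in $N = 1$ supersymmetry. Thus, 
to each of the (Weyl) spinors, we assign a complex scalar superpartner. This 
will be described in more detail when we consider the MSSM. 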

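To introduce the notion of a supersymmetric transformation, let us 
consider an infinitesimal spinor $\xi^\alpha$ with the properties that $\xi$ anticommutes 
with itself and the supercharge $Q$, but commutes with the momentum 
operator 

$$ \{\xi^\alpha, \xi^\beta\} = \{\xi^\alpha, Q_\beta\} = [P_\mu, \xi^\alpha] = 0. \tag{1.34} $$

It then follows that since both $\xi$ and $Q$ commute with $P_\mu$, the combination 
$\xi Q$ also commutes with $P_\mu$, or 

$$ [P_\mu, \xi Q] = [P_\mu, \bar{\xi} \bar{Q}] = 0, \tag{1.35} $$

where by $\xi Q$ we mean $\xi Q = \xi^\alpha Q_\alpha = \xi^\alpha \epsilon_{\alpha\beta} Q^\beta = -\epsilon_{\alpha\beta} Q^\beta \xi^\alpha = \epsilon_{\beta\alpha} Q^\beta \xi^\alpha = 
Q_\alpha \xi^\alpha$. Similarly, $\bar{\xi} \bar{Q} = \bar{\xi}_{\dot\alpha} \bar{Q}^{\dot\alpha}$. Also note that $\xi^\alpha Q_\alpha = -\xi_\alpha Q^\alpha$. Finally, we 
can compute the commutator of $\xi Q$ and $\bar{\xi} \bar{Q}$, 

$$ [\xi Q, \bar{\xi} \bar{Q}] = \xi Q\, \bar{\xi} \bar{Q} - \bar{\xi} \bar{Q}\, \xi Q = 2 \xi \sigma^\mu \bar{\xi} P_\mu. \tag{1.36} $$

We next consider the transformation property of a scalar field, $\phi$, under the 
infinitesimal $\xi$ 

$$ \delta_\xi \phi = (\xi Q + \bar{\xi} \bar{Q}) \phi. \tag{1.37} $$

As described above, we can pick a basis so that $\bar{Q}_{\dot\alpha} \phi = 0$. Let us call the spin 
1/2 fermion $Q_\alpha \phi = \sqrt{2} \psi_\alpha$, so 

$$ \delta_\xi \phi = \xi^\alpha Q_\alpha \phi = \sqrt{2}\, \xi^\alpha \psi_\alpha. \tag{1.38} $$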

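To further specify the supersymmetry transformation, we must define the 
action of $Q$ and $\bar{Q}$ on $\psi$. Again, viewing $Q$ as a "raising" operator, we 
will write $Q_\alpha \psi_\gamma = -\sqrt{2} \epsilon_{\alpha\gamma} F$ and $\bar{Q}_{\dot\beta} \psi_\gamma = -\sqrt{2} i \sigma^\mu_{\gamma\dot\beta} \partial_\mu \phi$, where $F$, as we 
will see, is an auxiliary field to be determined later. Even though we had 
earlier argued that $Q$ acting on the spin 1/2 component of the chiral 
multiplet should vanish, we must keep it here, though as we will see, it does 
not correspond to a physical degree of freedom. To understand the action 
of $\bar{Q}$, we know that it must be related to the scalar $\phi$, and on 
dimensional grounds ($\bar{Q} \psi$ is of mass dimension 3/2) it must be proportional to 
$P_\mu \phi$. Then 

$$ \delta_\xi \psi_\gamma = (\xi Q + \bar{\xi} \bar{Q}) \psi_\gamma = -\sqrt{2}\, \xi^\alpha \epsilon_{\alpha\gamma} F + \sqrt{2} i \sigma^\mu_{\gamma\dot\beta} \bar{\xi}^{\dot\beta} \partial_\mu \phi $$

$$ \phantom{\delta_\xi \psi_\gamma} = \sqrt{2}\, \xi_\gamma F + \sqrt{2} i (\sigma^\mu \bar{\xi})_\gamma \partial_\mu \phi. \tag{1.39} $$

Given these definitions, let us consider the successive action of two 
supersymmetry transformations $\delta_\xi$ and $\delta_\eta$, 

$$ \delta_\eta \delta_\xi \phi = \sqrt{2}\, \delta_\eta (\xi^\alpha \psi_\alpha) = \sqrt{2} \left[ -\sqrt{2}\, \xi^\alpha \eta^\gamma \epsilon_{\gamma\alpha} F + \sqrt{2} i\, \xi^\alpha \sigma^\mu_{\alpha\dot\beta} \bar{\eta}^{\dot\beta} \partial_\mu \phi \right]. \tag{1.40} $$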

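If we take the difference $(\delta_\eta \delta_\xi - \delta_\xi \delta_\eta) \phi$, we see that the first term in 
(1.40) cancels if we write $\xi^\alpha \eta^\gamma \epsilon_{\gamma\alpha} = -\xi^\alpha \eta_\alpha$ and note that $\xi^\alpha \eta_\alpha = \eta^\alpha \xi_\alpha$. 
Therefore the difference can be written as 

$$ (\delta_\eta \delta_\xi - \delta_\xi \delta_\eta) \phi = 2 (\eta \sigma^\mu \bar{\xi} - \xi \sigma^\mu \bar{\eta}) P_\mu \phi. \tag{1.41} $$

In fact it is not difficult to show that (1.41) is a result of the general operator 
relation 

$$ \delta_\eta \delta_\xi - \delta_\xi \delta_\eta = 2 (\eta \sigma^\mu \bar{\xi} - \xi \sigma^\mu \bar{\eta}) P_\mu. \tag{1.42} $$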

Knowing the general relation (1.42) and applying it to a fermion ipj will 
allow us to determine the transformation properties of the auxiliary field F. 
Starting with 

= -V2Cecj6^F + (1.43) 

we use the Fierz identity X~fV°'Ca + V-yC°‘Xa + C-yX°‘Va = 0, and making the 
substitutions, Xj = = Vj ^ind C = dutp, we have 

= -V2C£a-f5,jF - (1-44) 

Next we use the spinor identity, along with 

rj'^X'j = X'^V-y (from above) to get 

= -V2^ea-ySr,F + 2i{r]y^'^i^ ~ (1-45) 

It is not hard to see then that the difference of the double transformation 
becomes 

{Srf6^ - = 2{r]a^^'^ - ^a>^7]^)Pfj,ip^ 

+V2{^^S^F -r]^6^F). (1.46) 

Thus, the operator relation (1.42) will be satisfied only if 

$$ \delta_\xi F = -\sqrt{2}\, (\bar{\xi} \bar{\sigma}^\mu P_\mu \psi), \tag{1.47} $$

and we have the complete set of transformation rules for the chiral multiplet. 

2 The simplest models 

2.1 The massless non-interacting Wess-Zumino model 

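We begin by writing down the Lagrangian for a single chiral multiplet 
containing a complex scalar field and a Majorana fermion 

$$ \mathcal{L} = -\partial_\mu \phi^* \partial^\mu \phi - i \psi^\dagger \bar{\sigma}^\mu \partial_\mu \psi $$

$$ \phantom{\mathcal{L}} = -\partial_\mu \phi^* \partial^\mu \phi - \frac{i}{2} \left( \psi^\dagger \bar{\sigma}^\mu \partial_\mu \psi - \partial_\mu \psi^\dagger \bar{\sigma}^\mu \psi \right), \tag{2.1} $$

where the second line in (2.1) is obtained by a partial integration and is 
done to simplify the algebra below. 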

We must first check the invariance of the Lagrangian under the super- 
symmetry transformations discussed in the previous section. 

5C = 

+ d^F*a^%p + -^^a''a^%pd^dy(j)* + h.c. (2.2) 

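Now with the help of still one more identity, $(\bar{\sigma}^\mu \sigma^\nu + \bar{\sigma}^\nu \sigma^\mu)^{\dot\alpha}{}_{\dot\beta} = -2 \eta^{\mu\nu} \delta^{\dot\alpha}_{\dot\beta}$, 

we can expand the above expression 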



SC = 



- d^F*a^^xlj) + h.c. 
V2 



(2.3) 



Fortunately, we now have some cancellations. The first and third terms 
in (2.3) trivially cancel. Using the commutativity of the partial derivative 
and performing a simple integration by parts we see that the second and 
fourth terms also cancel. We are left with (again after an integration by parts) 

$$ \delta \mathcal{L} = -i \sqrt{2}\, \bar{\xi} F^* \bar{\sigma}^\mu \partial_\mu \psi + \mathrm{h.c.}, \tag{2.4} $$

indicating the lack of invariance of the Lagrangian (2.1). 

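We can recover the invariance under the supersymmetry transformations 
by considering in addition to the Lagrangian (2.1) the following, 

$$ \mathcal{L}_{\rm aux} = F^* F, \tag{2.5} $$

and its variation 

$$ \delta \mathcal{L}_{\rm aux} = \delta F^* F + F^* \delta F. \tag{2.6} $$

The variation of the auxiliary field, $F$, was determined in (1.47) and gives 

$$ \delta \mathcal{L}_{\rm aux} = i \sqrt{2}\, \bar{\xi} F^* \bar{\sigma}^\mu \partial_\mu \psi + \mathrm{h.c.}, \tag{2.7} $$

and exactly cancels the piece left over in (2.4). Therefore the Lagrangian 

$$ \mathcal{L} = -\partial_\mu \phi^* \partial^\mu \phi - i \psi^\dagger \bar{\sigma}^\mu \partial_\mu \psi + F^* F \tag{2.8} $$

is fully invariant under the set of supersymmetry transformations. 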
2.2 Interactions for chiral multiplets 

Our next task is to include interactions for chiral multiplets which are also 
consistent with supersymmetry. We will therefore consider a set of chiral 
multiplets, $(\phi_i, \psi_i, F_i)$, and a renormalizable Lagrangian, $\mathcal{L}_{\rm int}$. 
Renormalizability limits the mass dimension of any term in the Lagrangian to be less 
than or equal to 4. Since the interaction Lagrangian must also be 
invariant under the supersymmetry transformations, we do not expect any terms 
which are cubic or quartic in the scalar fields $\phi_i$. Clearly no term can be 
linear in the fermion fields either. This leaves us with only the following 
possibilities 

$$ \mathcal{L}_{\rm int} = \frac{1}{2} A^{ij} \psi_i \psi_j + B^i F_i + \mathrm{h.c.}, \tag{2.9} $$

where $A^{ij}$ is some linear function of the $\phi_i$ and $\phi^{i*}$ and $B^i$ is some function 
which is at most quadratic in the scalars and their conjugates. Here, and 
in all that follows, it will be assumed that repeated indices such as $ii$ are 
summed. Furthermore, since $\psi_i \psi_j = \psi_j \psi_i$ (spinor indices are suppressed), 
the function $A^{ij}$ must be symmetric in $ij$. As we will see, the functions $A$ 
and $B$ will be related by insisting on the invariance of (2.9). 

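We begin therefore with the variation of $\mathcal{L}_{\rm int}$

$$\begin{aligned}
\delta\mathcal{L}_{\rm int} ={}& \frac{1}{2}\frac{\partial A^{ij}}{\partial\phi_k}(\sqrt{2}\,\xi\psi_k)(\psi_i\psi_j) + \frac{1}{2}\frac{\partial A^{ij}}{\partial\phi^{k*}}(\sqrt{2}\,\xi^\dagger\psi^{\dagger k})(\psi_i\psi_j) \\
&+ \frac{1}{2}A^{ij}(\sqrt{2}\,\xi F_i + \sqrt{2}\,i\sigma^\mu\xi^\dagger\partial_\mu\phi_i)\psi_j \\
&+ \frac{1}{2}A^{ij}\psi_i(\sqrt{2}\,\xi F_j + \sqrt{2}\,i\sigma^\mu\xi^\dagger\partial_\mu\phi_j) \\
&+ \frac{\partial B^i}{\partial\phi_j}(\sqrt{2}\,\xi\psi_j)F_i + \frac{\partial B^i}{\partial\phi^{j*}}(\sqrt{2}\,\xi^\dagger\psi^{\dagger j})F_i \\
&+ B^i\sqrt{2}\,i\,\xi^\dagger\bar\sigma^\mu\partial_\mu\psi_i + {\rm h.c.}
\end{aligned} \tag{2.10}$$

where the supersymmetry transformations of the previous section have
already been performed. The notation $(\psi_i\psi_j)$ refers to
$\psi_i^\alpha\psi_{j\alpha}$; clearly spinor indices have everywhere been
suppressed. The Fierz identity
$(\xi\psi_k)(\psi_i\psi_j) + (\xi\psi_i)(\psi_j\psi_k) + (\xi\psi_j)(\psi_k\psi_i) = 0$
implies that the derivative of the function $A^{ij}$ with respect to $\phi_k$
(as in the first term of (2.10)) must be symmetric in $ijk$. Because there is
no such identity for the second term with derivative with respect to
$\phi^{k*}$, this term must vanish. Therefore, the function $A^{ij}$ is a
holomorphic function of the $\phi_i$ only. Given these constraints, we can write

$$A^{ij} = -M^{ij} - y^{ijk}\phi_k \tag{2.11}$$

where by (2.9) we interpret $M^{ij}$ as a symmetric fermion mass matrix, and
$y^{ijk}$ as a set of (symmetric) Yukawa couplings. In fact, it will be convenient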
to write

$$A^{ij} = -\frac{\partial^2 W}{\partial\phi_i\,\partial\phi_j} \tag{2.12}$$

where

$$W = \frac{1}{2}M^{ij}\phi_i\phi_j + \frac{1}{6}y^{ijk}\phi_i\phi_j\phi_k \tag{2.13}$$

and is called the superpotential.

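Noting that the 2nd and 3rd lines of (2.10) are equal due to the symmetry
of $A^{ij}$, we can rewrite the remaining terms as

$$\begin{aligned}
\delta\mathcal{L}_{\rm int} ={}& \sqrt{2}\,A^{ij}(\xi\psi_j)F_i + \sqrt{2}\,i\,A^{ij}(\psi_j\sigma^\mu\xi^\dagger)\partial_\mu\phi_i
+ \sqrt{2}\,\frac{\partial B^i}{\partial\phi_j}(\xi\psi_j)F_i \\
&+ \sqrt{2}\,\frac{\partial B^i}{\partial\phi^{j*}}(\xi^\dagger\psi^{\dagger j})F_i
- \sqrt{2}\,i\,B^i(\partial_\mu\psi_i\sigma^\mu\xi^\dagger) + {\rm h.c.}
\end{aligned} \tag{2.14}$$

using in addition one of our previous spinor identities on the last line.
Further noting that, because of our definition of the superpotential in terms
of $A^{ij}$, we can write $A^{ij}\partial_\mu\phi_j = -\partial_\mu(\partial W/\partial\phi_i)$.
Then the 2nd and last terms of (2.14) can be combined as a total derivative if

$$B^i = \frac{\partial W}{\partial\phi_i} \tag{2.15}$$

and thus $B^i$ is also related to the superpotential $W$. Then the 4th term,
proportional to $\partial B^i/\partial\phi^{j*}$, is absent due to the
holomorphic property of $W$, and the definition of $B$ (2.15) allows for a
trivial cancellation of the 1st and 3rd terms in (2.14). Thus our interaction
Lagrangian (2.9) is in fact supersymmetric with the imposed relationships
between the functions $A^{ij}$, $B^i$, and the superpotential $W$.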

After all of this, what is the auxiliary field $F$? It has been designated as
an “auxiliary” field, because it has no proper kinetic term in (2.8). It
can therefore be removed via the equations of motion. Combining the
Lagrangians (2.8) and (2.9) we see that the variation of the Lagrangian
with respect to $F$ is

$$\frac{\delta\mathcal{L}}{\delta F_i} = F^{i*} + W^i \tag{2.16}$$

where we can now use the convenient notation that $W^i = \partial W/\partial\phi_i$,
$W_i^* = \partial W^*/\partial\phi^{i*}$, and
$W^{ij} = \partial^2 W/\partial\phi_i\partial\phi_j$, etc. The vanishing of
(2.16) then implies that

$$F_i = -W_i^*. \tag{2.17}$$

Putting everything together we have

$$\mathcal{L} = -\partial_\mu\phi^{i*}\partial^\mu\phi_i - i\psi^{\dagger i}\bar\sigma^\mu\partial_\mu\psi_i
- \frac{1}{2}\left(W^{ij}\psi_i\psi_j + W^*_{ij}\psi^{\dagger i}\psi^{\dagger j}\right) - W^i W_i^*. \tag{2.18}$$

As one can see, the last term plays the role of the scalar potential

$$V(\phi_i, \phi^{i*}) = W^i W_i^*. \tag{2.19}$$
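
As a small numerical illustration of (2.19), the sketch below evaluates the scalar potential for a single chiral multiplet with the superpotential (2.13) reduced to one field, $W = \frac{1}{2}M\phi^2 + \frac{1}{6}y\phi^3$. The function names and the numerical values of $M$ and $y$ are illustrative assumptions, not part of the text.

```python
def superpotential_gradient(phi: complex, mass: float, yukawa: float) -> complex:
    """Return W^1 = dW/dphi for W = (1/2) M phi^2 + (1/6) y phi^3."""
    return mass * phi + 0.5 * yukawa * phi ** 2


def scalar_potential(phi: complex, mass: float, yukawa: float) -> float:
    """Return V = |W^1|^2, the single-field case of (2.19)."""
    return abs(superpotential_gradient(phi, mass, yukawa)) ** 2


if __name__ == "__main__":
    M, y = 1.0, 0.5  # illustrative values, not taken from the text
    for phi in (0.0, -2.0 * M / y, 1.0 + 1.0j):
        print(phi, scalar_potential(phi, M, y))
```

With these inputs the potential vanishes at $\phi = 0$ and at $\phi = -2M/y$, the two points where $W^1 = 0$, and is positive elsewhere.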



2.3 Gauge multiplets 

In principle, we should next determine the invariance of the Lagrangian
including a vector or gauge multiplet. To repeat the above exercise performed
for chiral multiplets, while necessary, is very tedious. Here, I will only list
some of the more important ingredients.

Similar to the definition in (1.4), the gauge covariant derivative acting
on scalars and chiral fermions is

$$D_\mu = \partial_\mu - i g\, T \cdot A_\mu \tag{2.20}$$

where $T$ is the relevant representation of the gauge group. For $SU(2)$, we
have simply that $T^i = \sigma^i/2$. In the gaugino kinetic term, the covariant
derivative becomes

$$(D_\mu \lambda)^a = \partial_\mu \lambda^a + g f^{abc} A^b_\mu \lambda^c \tag{2.21}$$

where the $f^{abc}$ are the (antisymmetric) structure constants of the gauge
group under consideration ($[T^a, T^b] = i f^{abc} T^c$). Gauge invariance for a
vector field, $A^a_\mu$, is manifest through a gauge transformation of the form

$$\delta_{\rm gauge} A^a_\mu = -\partial_\mu \Lambda^a + g f^{abc} \Lambda^b A^c_\mu \tag{2.22}$$

where $\Lambda$ is an infinitesimal gauge transformation parameter. To this, we
must add the gauge transformation of the spin 1/2 fermion partner, or
gaugino, in the vector multiplet

$$\delta_{\rm gauge} \lambda^a = g f^{abc} \Lambda^b \lambda^c \tag{2.23}$$

Given our experience with chiral multiplets, it is not too difficult to
construct the supersymmetry transformations for the vector multiplet.
Starting with $A^a_\mu$, which is real, and taking $Q A^a_\mu = 0$, one finds that

$$\delta_\xi A^a_\mu = i\left[\xi^\dagger \bar\sigma_\mu \lambda^a - \lambda^{a\dagger} \bar\sigma_\mu \xi\right]. \tag{2.24}$$

Similarly, applying the supersymmetry transformation to $\lambda^a$ leads to a
derivative of $A^a_\mu$ (to account for the mass dimension) which must be in the
form of the field strength $F^a_{\mu\nu}$, and an auxiliary field, which is
conventionally named $D^a$. Thus,



(2.25) 
As before, we can determine the transformation properties of $D^a$ by
applying successive supersymmetry transformations as in (1.42) with the
substitution $\partial_\mu \to D_\mu$ using (2.20) above. The result is,

-h . (2.26) 

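Also in analogy with the chiral model, the simplest Lagrangian for the vector
multiplet is

$$\mathcal{L}_{\rm gauge} = -\frac{1}{4} F^a_{\mu\nu} F^{a\mu\nu} - i \lambda^{a\dagger} \bar\sigma^\mu D_\mu \lambda^a + \frac{1}{2} D^a D^a \tag{2.27}$$

In (2.27), the gauge kinetic terms are given in general by
$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu$.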

If we have both chiral and gauge multiplets in the theory (as we must)
then we must make simple modifications to the supersymmetry
transformations discussed in the previous section and add new gauge invariant
interactions between the chiral and gauge multiplets which also respect
supersymmetry. To (1.39), we must only change $\partial_\mu \to D_\mu$ using (2.20).
To (1.47), it will be necessary to add a term proportional to
$(T^a\phi)\xi^\dagger\lambda^{a\dagger}$ so that,

$$\delta_\xi F = i\sqrt{2}\,\xi^\dagger \bar\sigma^\mu D_\mu \psi + 2 i g (T^a \phi)\, \xi^\dagger \lambda^{a\dagger} \tag{2.28}$$

The new interaction terms take the form

$$\mathcal{L}_{\rm int} = \sqrt{2}\, g\, i \left[(\phi^* T^a \psi)\lambda^a - \lambda^{a\dagger}(\psi^\dagger T^a \phi)\right] + g(\phi^* T^a \phi) D^a \tag{2.29}$$

Furthermore, invariance under supersymmetry requires not only the
additional term in (2.28), but also the condition

$$W^i (T^a)_i^{\ j} \phi_j = 0. \tag{2.30}$$

Finally, we must eliminate the auxiliary field $D^a$ using the equations of
motion which yield

$$D^a = -g(\phi^* T^a \phi). \tag{2.31}$$

Thus the “D-term” is also seen to be a part of the scalar potential which
in full is now

$$V(\phi, \phi^*) = |F^i|^2 + \frac{1}{2}|D^a|^2 = |W^i|^2 + \frac{1}{2} g^2 (\phi^* T^a \phi)^2. \tag{2.32}$$

Notice a very important property of the scalar potential in supersymmetric
theories: the potential is positive semi-definite, $V \geq 0$.

2.4 Interactions

The types of gauge invariant interactions allowed by supersymmetry are
scattered throughout the pieces of the supersymmetric Lagrangian. Here,
we will simply identify the origin of a particular interaction term, its
coupling, and associated Feynman diagram. In the diagrams below, the arrows
denote the flow of chirality. Here, $\psi$ will represent an incoming fermion,
and $\psi^\dagger$ an outgoing one. While there is no true chirality associated
with the scalars, we can still make the association as the scalars are
partnered with the fermions in a given supersymmetric multiplet. We will
indicate $\phi$ with an incoming scalar state, and $\phi^*$ with an outgoing one.

Starting from the superpotential (2.13) and the Lagrangian (2.18), we
can identify several interaction terms and their associated diagrams:

• A fermion mass insertion from $W^{ij}\psi_i\psi_j$

• A scalar-fermion Yukawa coupling, also from $W^{ij}\psi_i\psi_j$

• A scalar mass insertion from $|W^i|^2$

• A scalar cubic interaction from $|W^i|^2$ (plus its complex conjugate,
which is not shown)

• And finally a scalar quartic interaction from $|W^i|^2$

Next we must write down the interactions and associated diagrams for the
gauge multiplets. The first two are standard gauge field interaction terms in
any non-abelian gauge theory (so that $f^{abc} \neq 0$) and arise from the gauge
kinetic term $F_{\mu\nu}^2$; the third is an interaction between the vector and
its fermionic partner, and arises from the gaugino kinetic term
$\lambda^\dagger \bar\sigma^\mu D_\mu \lambda$.

• The quartic gauge interaction from $F_{\mu\nu}^2$ (to be summed over the
repeated gauge indices)

$g^2 f^{abc} f^{ade} A^{b\mu} A^{c\nu} A^d_\mu A^e_\nu$

• The trilinear gauge interaction, also from $F_{\mu\nu}^2$

• The gauge-gaugino interaction from $\lambda^\dagger \bar\sigma^\mu D_\mu \lambda$

If our chiral multiplets are not gauge singlets, then we also have
interaction terms between the vectors and the fermions and scalars of the chiral
multiplet arising from the chiral kinetic terms. Recalling that the kinetic
terms for the chiral multiplets must be expressed in terms of the gauge
covariant derivative (2.20), we find the following interactions, from
$|D_\mu\phi|^2$ and $\psi^\dagger\bar\sigma^\mu D_\mu\psi$:

• A quartic interaction involving two gauge bosons and two scalars.

• A cubic interaction involving one gauge boson and two scalars,

$g\left(A^{a\mu}(T^a\phi)\partial_\mu\phi^* + {\rm h.c.}\right)$

• A cubic interaction involving one gauge boson and two fermions.





Finally, there will be two additional diagrams: one coming from the
interaction term involving both the chiral and gauge multiplet, and one coming
from $D^2$,

• A cubic interaction involving a gaugino, and a chiral scalar and
fermion pair,

$\sqrt{2}\, g\, i\left((\phi^* T^a \psi)\lambda^a + {\rm h.c.}\right)$

• Another quartic interaction, this one involving four scalars, which follows
from $\frac{1}{2}|D^a|^2$ with $D^a$ given by (2.31),

$\frac{1}{2} g^2 (\phi^* T^a \phi)^2$

2.5 Supersymmetry breaking 

The world, as we know it, is clearly not supersymmetric. Without much 
ado, it is clear from the diagrams above, that for every chiral fermion of 
mass M, we expect to have a scalar superpartner of equal mass. This is, 
however, not the case, and as such we must be able to incorporate some 
degree of supersymmetry breaking into the theory. At the same time, we
would like to maintain the positive benefits of supersymmetry such as the
resolution of the hierarchy problem.

To begin, we must be able to quantify what we mean by supersymmetry
breaking. From the anti-commutation relations (1.22), we see that we can
write an expression for the Hamiltonian or $P^0$ using the explicit forms of
the Pauli matrices as

$$H = P^0 = \frac{1}{4}\sum_{\alpha=1}^{2}\left\{Q_\alpha, Q_\alpha^\dagger\right\} \tag{2.33}$$

A supersymmetric vacuum must be invariant under the supersymmetry
transformations and therefore would require $Q|0\rangle = 0$ and
$Q^\dagger|0\rangle = 0$, and therefore corresponds to $H = 0$ and also $V = 0$.
Thus, the supersymmetric vacuum must have $|F| = |D| = 0$. Conversely, if
supersymmetry is spontaneously broken, the vacuum is not annihilated by the
supersymmetry charge $Q$ so that $Q|0\rangle = \chi$ and
$\langle\chi|Q|0\rangle = f_\chi^2$, where $\chi$ is a fermionic field associated
with the breakdown of supersymmetry and, in analogy with the breakdown
of a global symmetry, is called the Goldstino. For $f_\chi \neq 0$,
$\langle 0|H|0\rangle = V_0 \neq 0$, which requires therefore either (or both)
$|F| \neq 0$ or $|D| \neq 0$. Mechanisms for the spontaneous breaking of
supersymmetry will be discussed in the next lecture.

It is also possible that to a certain degree, supersymmetry is explicitly 
broken in the Lagrangian. In order to preserve the hierarchy between the 
electroweak and GUT or Planck scales, it is necessary that the explicit 
breaking of supersymmetry be done softly, i.e., by the insertion of weak 
scale mass terms in the Lagrangian. This ensures that the theory remain 
free of quadratic divergences [13]. The possible forms for such terms are 

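$$\mathcal{L}_{\rm soft} = -\frac{1}{2} M_\lambda^a \lambda^a \lambda^a - \frac{1}{2}(m^2)^i_j \phi_i \phi^{j*}
- \frac{1}{2}(BM)^{ij}\phi_i\phi_j - \frac{1}{6}(Ay)^{ijk}\phi_i\phi_j\phi_k + {\rm h.c.} \tag{2.34}$$

where the $M_\lambda^a$ are gaugino masses, $m^2$ are soft scalar masses, $B$ is
a bilinear mass term, and $A$ is a trilinear mass term. Masses for the gauge
bosons are of course forbidden by gauge invariance and masses for chiral
fermions are redundant as such terms are explicitly present in $M^{ij}$
already. The diagrams corresponding to these terms are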

• A soft supersymmetry breaking gaugino mass insertion, $M_\lambda^a\lambda^a\lambda^a$

• A soft supersymmetry breaking scalar mass insertion

• A soft supersymmetry breaking bilinear mass insertion

• A soft supersymmetry breaking trilinear scalar interaction

We are now finally in a position to put all of these pieces together and 
discuss realistic supersymmetric models. 

3 The minimal supersymmetric standard model 

To construct the supersymmetric standard model [14] we start with the 
complete set of chiral fermions in (1.30), and add a scalar superpartner to 
each Weyl fermion so that each of the fields in (1.30) represents a chiral 
multiplet. Similarly we must add a gaugino for each of the gauge bosons 
in the standard model making up the gauge multiplets. The minimal su- 
persymmetric standard model (MSSM) [15] is defined by its minimal field 
content (which accounts for the known standard model fields) and minimal
superpotential necessary to account for the known Yukawa mass terms. As
such we define the MSSM by the superpotential

$$W = \epsilon_{ij}\left[y_e H_1^j L^i e^c + y_d H_1^j Q^i d^c + y_u H_2^i Q^j u^c\right] + W_\mu \tag{3.1}$$

where

$$W_\mu = \epsilon_{ij}\,\mu H_1^i H_2^j \tag{3.2}$$

In (3.1), the indices, $\{ij\}$, are $SU(2)_L$ doublet indices. The Yukawa
couplings, $y$, are all $3 \times 3$ matrices in generation space. Note that
there is no generation index for the Higgs multiplets. Color and generation
indices have been suppressed in the above expression. There are two Higgs
doublets in the MSSM. This is a necessary addition to the standard model which
can be seen as arising from the holomorphic property of the superpotential.
That is, there would be no way to account for all of the Yukawa terms for both
up-type and down-type multiplets with a single Higgs doublet. To avoid a
massless Higgs state, a mixing term $W_\mu$ must be added to the superpotential.

From the rules governing the interactions in supersymmetry discussed
in the previous section, it is easy to see that the terms in (3.1) are easily
identifiable as fermion masses if the Higgses obtain vacuum expectation
values (vevs). For example, the first term will contain an interaction which
we can write as

$$-\frac{1}{2}\frac{\partial^2 W}{\partial L\,\partial e^c}\,\psi_L \psi_{e^c} = -\frac{1}{2} y_e H_1^0 \left(e\, e^c + e^c e\right) \tag{3.3}$$

where it is to be understood that in (3.3) $H_1$ refers to the scalar
component of the Higgs $H_1$, and $\psi_L$ and $\psi_{e^c}$ represent the
fermionic components of the left-handed lepton doublet and right-handed singlet
respectively. Gauge invariance requires that, as defined in (3.1), $H_1$ has
hypercharge $Y_{H_1} = -1$ (and $Y_{H_2} = +1$). Therefore if the two doublets
obtain expectation values of the form

$$\langle H_1 \rangle = \begin{pmatrix} v_1 \\ 0 \end{pmatrix} \qquad \langle H_2 \rangle = \begin{pmatrix} 0 \\ v_2 \end{pmatrix} \tag{3.4}$$

then (3.3) contains a term which corresponds to an electron mass term with

$$m_e = y_e v_1. \tag{3.5}$$

Similar expressions are easily obtained for all of the other massive fermions
in the standard model. Clearly as there is no $\nu^c$ state in the minimal
model, neutrinos remain massless. Both Higgs doublets must obtain vacuum values
and it is convenient to express their ratio as a parameter of the model,

$$\tan\beta = \frac{v_2}{v_1}. \tag{3.6}$$

3.1 The Higgs sector

Of course if the vevs for $H_1$ and $H_2$ exist, they must be derivable from
the scalar potential which in turn is derivable from the superpotential and
any soft terms which are included. The part of the scalar potential which
involves only the Higgs bosons is

$$\begin{aligned}
V ={}& |\mu|^2 (H_1^* H_1 + H_2^* H_2) + \frac{1}{8} g'^2 (H_2^* H_2 - H_1^* H_1)^2 \\
&+ \frac{1}{8} g^2 \left(4|H_1^* H_2|^2 - 2(H_1^* H_1)(H_2^* H_2) + (H_1^* H_1)^2 + (H_2^* H_2)^2\right) \\
&+ m_1^2 H_1^* H_1 + m_2^2 H_2^* H_2 + (B\mu\,\epsilon_{ij} H_1^i H_2^j + {\rm h.c.}).
\end{aligned} \tag{3.7}$$

In (3.7), the first term is a so-called $F$-term, derived from
$|\partial W/\partial H_1|^2$ and $|\partial W/\partial H_2|^2$ setting all
sfermion vevs equal to 0. The next two terms are $D$-terms, the first a
$U(1)$ $D$-term, recalling that the hypercharges for the Higgses are
$Y_{H_1} = -1$ and $Y_{H_2} = 1$, and the second is an $SU(2)$ $D$-term,
taking $T^a = \sigma^a/2$ where $\sigma^a$ are the three Pauli matrices.
Finally, the last three terms are soft supersymmetry breaking masses $m_1$ and
$m_2$, and the bilinear term $B\mu$. The Higgs doublets can be written as

$$H_1 = \begin{pmatrix} H_1^0 \\ H_1^- \end{pmatrix} \qquad H_2 = \begin{pmatrix} H_2^+ \\ H_2^0 \end{pmatrix} \tag{3.8}$$

and by $(H_1^* H_1)$ we mean $H_1^{0*} H_1^0 + H_1^{-*} H_1^-$ etc.

The neutral portion of (3.7) can be expressed more simply as

$$V = \frac{g^2 + g'^2}{8}\left(|H_1^0|^2 - |H_2^0|^2\right)^2 + (m_1^2 + |\mu|^2)|H_1^0|^2 + (m_2^2 + |\mu|^2)|H_2^0|^2 + (B\mu H_1^0 H_2^0 + {\rm h.c.}). \tag{3.9}$$

For electroweak symmetry breaking, it will be required that either one (or
both) of the soft masses ($m_1^2$, $m_2^2$) be negative (as in the standard model).

In the standard model, the single Higgs doublet leads to one real scalar
field, as the other three components are eaten by the massive electroweak
gauge bosons. In the supersymmetric version, the eight components result
in 2 real scalars ($h$, $H$); 1 pseudo-scalar ($A$); and one charged Higgs
($H^\pm$); the other three components are eaten. Also as in the standard model,
one must expand the Higgs doublets about their vevs, and we can express the
components of the Higgses in terms of the mass eigenstates

$$\langle H_1 \rangle = \begin{pmatrix} v_1 + \frac{1}{\sqrt{2}}\left[H\cos\alpha - h\sin\alpha + iA\sin\beta\right] \\ H^-\sin\beta \end{pmatrix} \qquad
\langle H_2 \rangle = \begin{pmatrix} H^+\cos\beta \\ v_2 + \frac{1}{\sqrt{2}}\left[H\sin\alpha + h\cos\alpha + iA\cos\beta\right] \end{pmatrix} \tag{3.10}$$

From the two vevs, we can define $v^2 = v_1^2 + v_2^2$ so that
$M_W^2 = \frac{1}{2} g^2 v^2$ as in the standard model.

In addition, electroweak symmetry breaking places restrictions on the
set of parameters appearing in the Higgs potential (3.9). If we use the two
conditions

$$\frac{\partial V}{\partial |H_1^0|} = 0 \qquad \frac{\partial V}{\partial |H_2^0|} = 0 \tag{3.11}$$

with a little algebra, we can obtain the following two conditions

$$-2B\mu = (m_1^2 + m_2^2 + 2\mu^2)\sin 2\beta \tag{3.12}$$

and

$$v^2 = \frac{4\left(m_1^2 + \mu^2 - (m_2^2 + \mu^2)\tan^2\beta\right)}{(g^2 + g'^2)(\tan^2\beta - 1)}. \tag{3.13}$$



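From the potential and these two conditions, the masses of the physical
scalars can be obtained. At the tree level,

$$m_{H^\pm}^2 = m_A^2 + m_W^2 \tag{3.14}$$

$$m_A^2 = m_1^2 + m_2^2 + 2\mu^2 = -B\mu(\tan\beta + \cot\beta) \tag{3.15}$$

$$m_{H,h}^2 = \frac{1}{2}\left[m_A^2 + m_Z^2 \pm \sqrt{(m_A^2 + m_Z^2)^2 - 4 m_A^2 m_Z^2 \cos^2 2\beta}\right]. \tag{3.16}$$

The Higgs mixing angle is defined by

$$\tan 2\alpha = \tan 2\beta \left[\frac{m_H^2 + m_h^2}{m_A^2 - m_Z^2}\right]. \tag{3.17}$$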



Notice that these expressions and the above constraints limit the number of
free inputs in the MSSM. First, from the mass of the pseudoscalar, we see
that $B\mu$ is not independent and can be expressed in terms of $m_A$ and
$\tan\beta$. Furthermore from the conditions (3.12) and (3.13), we see that if
we keep $\tan\beta$, we can either choose $m_A$ and $\mu$ as free inputs, thereby
determining the two soft masses, $m_1$ and $m_2$, or we can choose the soft
masses as inputs, and fix $m_A$ and $\mu$ by the conditions for electroweak
symmetry breaking. Both choices of parameter fixing are widely used in the
literature.
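
The first of these two choices can be sketched numerically. The snippet below takes $m_A$, $\mu$ and $\tan\beta$ as inputs, returns the tree-level Higgs masses, and solves the conditions (3.12) and (3.13) for the soft masses. It assumes the standard relation $M_Z^2 = (g^2 + g'^2)v^2/2$ to eliminate the gauge couplings from (3.13); the function names and the numerical values of $M_Z$ and $M_W$ are illustrative assumptions.

```python
import math

M_Z = 91.19  # GeV, illustrative input
M_W = 80.4   # GeV, illustrative input


def tree_level_higgs_masses(m_a, tan_beta, m_z=M_Z, m_w=M_W):
    """Tree-level Higgs masses (GeV) for given m_A and tan(beta)."""
    cos_2beta = math.cos(2.0 * math.atan(tan_beta))
    total = m_a ** 2 + m_z ** 2
    disc = total ** 2 - 4.0 * m_a ** 2 * m_z ** 2 * cos_2beta ** 2
    root = math.sqrt(max(0.0, disc))
    return {
        "h": math.sqrt(max(0.0, 0.5 * (total - root))),
        "H": math.sqrt(0.5 * (total + root)),
        "H+-": math.sqrt(m_a ** 2 + m_w ** 2),
    }


def soft_masses(m_a, mu, tan_beta, m_z=M_Z):
    """Solve (3.12)-(3.13) for (m_1^2, m_2^2, B*mu) given m_A, mu, tan(beta).

    With M_Z^2 = (g^2 + g'^2) v^2 / 2 (assumed), (3.13) reads
    (m_1^2 + mu^2) - (m_2^2 + mu^2) tan^2(beta) = (M_Z^2 / 2)(tan^2(beta) - 1),
    and the pseudoscalar mass gives m_A^2 = m_1^2 + m_2^2 + 2 mu^2.
    """
    t2 = tan_beta ** 2
    b = (m_a ** 2 - 0.5 * m_z ** 2 * (t2 - 1.0)) / (1.0 + t2)  # m_2^2 + mu^2
    a = m_a ** 2 - b                                            # m_1^2 + mu^2
    b_mu = -0.5 * m_a ** 2 * math.sin(2.0 * math.atan(tan_beta))
    return a - mu ** 2, b - mu ** 2, b_mu


if __name__ == "__main__":
    print(tree_level_higgs_masses(500.0, 10.0))
    print(soft_masses(500.0, 200.0, 10.0))
```

For large $\tan\beta$ the solution typically has $m_2^2 + \mu^2 < 0$, in line with the requirement that at least one of the quadratic coefficients be negative for electroweak symmetry breaking.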

The tree level expressions for the Higgs masses make some very definite
predictions. The charged Higgs is heavier than $M_W$, and the light Higgs
$h$ is necessarily lighter than $M_Z$. Note that, if uncorrected, the MSSM would
already be excluded (see discussion on current accelerator limits in Sect. 6).
However, radiative corrections to the Higgs masses are not negligible in
the MSSM, particularly for a heavy top mass $m_t \sim 175$ GeV. The leading
one-loop corrections to $m_h^2$ depend quartically on $m_t$ and can be expressed
as [16]




where $m_{\tilde t_{1,2}}$ are the physical masses of the two stop squarks to be
discussed in more detail shortly, $\tilde A_t \equiv A_t + \mu\cot\beta$ ($A_t$ is
the supersymmetry breaking trilinear term associated with the top quark Yukawa
coupling). The functions $h$ and $f$ are

Additional corrections to coupling vertices, two-loop corrections and
renormalization-group resummations have also been computed in the
MSSM [17]. With these corrections one can allow

$$m_h \lesssim 130\ {\rm GeV} \tag{3.20}$$

within the MSSM. While certainly higher than the tree level limit of $M_Z$,
the limit still predicts a relatively light Higgs boson, and allows the MSSM
to be experimentally excluded (or verified!) at the LHC.

Recalling the expression for the electron mass in the MSSM (3.5), we
can now express the Yukawa coupling $y_e$ in terms of masses and $\tan\beta$,

$$y_e = \frac{m_e}{v\cos\beta} = \frac{g\, m_e}{\sqrt{2}\, M_W \cos\beta}. \tag{3.21}$$

There are similar expressions for the other fermion masses, with the
replacement $\cos\beta \to \sin\beta$ for up-type fermions.



3.2 The sfermions 

We turn next to the question of the sfermion masses [18]. As an example,
let us consider the $\tilde u$ mass$^2$ matrix. Perhaps the most complicated
entry in the mass$^2$ matrix is the $L$-$L$ component. To begin with, there is a
soft supersymmetry breaking mass term, $m_Q^2$. In addition, from the
superpotential term, $y_u H_2 Q u^c$, we obtain an $F$-term contribution by taking
$\partial W/\partial u^c = y_u H_2 Q$. Inserting the vev for $H_2$, we have in
the $F$-term,

$$|\partial W/\partial u^c|^2 = |y_u v_2 \tilde u|^2 = m_u^2 |\tilde u|^2. \tag{3.22}$$

This is generally negligible for all but third generation sfermion masses.
Next we have the $D$-term contributions. Clearly to generate a
$\tilde u^* \tilde u$ term, we need only consider the $D$-term contributions from
diagonal generators, i.e., $T^3$ and $Y$, that is from

$$D^3 = -\frac{1}{2} g\left[|H_1^0|^2 - |H_2^0|^2 + |\tilde u|^2 + \cdots\right] \tag{3.23}$$

$$D' = -\frac{1}{2} g'\left[|H_2^0|^2 - |H_1^0|^2 + Y_Q |\tilde u|^2 + \cdots\right] \tag{3.24}$$

where $Y_Q = 2q_u - 1 = 1/3$ is the quark doublet hypercharge. Once again,
inserting vevs for $H_1$ and $H_2$ and keeping only relevant terms, we have for
the $D$-term contribution

$$\frac{1}{4}\left(g^2 v^2 \cos 2\beta - g'^2 v^2 \cos 2\beta\,(2q_u - 1)\right)\tilde u^* \tilde u
= M_Z^2 \cos 2\beta\left(\frac{1}{2} - q_u \sin^2\theta_W\right)\tilde u^* \tilde u. \tag{3.25}$$

Thus the total contribution to the $L$-$L$ component of the up-squark mass$^2$
matrix is

$$m_L^2 = m_Q^2 + m_u^2 + M_Z^2 \cos 2\beta\left(\frac{1}{2} - q_u \sin^2\theta_W\right). \tag{3.26}$$

Similarly it is easy to see that the $R$-$R$ component can be found from
the above expressions by discarding the $SU(2)_L$ $D$-term contribution and
recalling that $Y_{u^c} = 2q_{u^c} = -2q_u$. Then,

$$m_R^2 = m_{u^c}^2 + m_u^2 + M_Z^2 \cos 2\beta\left(q_u \sin^2\theta_W\right). \tag{3.27}$$

There are, however, in addition to the above diagonal entries, off-diagonal
terms coming from a supersymmetry breaking $A$-term, and an $F$-term. The
$A$-term is quickly found from $A_Q y_u H_2 Q u^c$ when setting the vev for
$H_2 = v_2$ and yields a term $A_Q m_u$. The $F$-term contribution comes from
$\partial W/\partial H_2 = \mu H_1 + y_u Q u^c$. When inserting the vev and
taking the square of the latter term, and keeping the relevant mass term, we
find for the total off-diagonal element

$$m_{LR}^2 = m_u (A_Q + \mu\cot\beta) = m_u \tilde A_Q. \tag{3.28}$$

Note that for a down-type sfermion, the definition of $\tilde A$ is modified by
taking $\cot\beta \to \tan\beta$. Also note that the off-diagonal term is
negligible for all but the third generation sfermions.

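Finally, to determine the physical sfermion states and their masses we
must diagonalize the sfermion mass matrix. This mass matrix is easily
diagonalized by writing the diagonal sfermion eigenstates as

$$\begin{aligned}
\tilde f_1 &= \tilde f_L \cos\theta_f + \tilde f_R \sin\theta_f, \\
\tilde f_2 &= -\tilde f_L \sin\theta_f + \tilde f_R \cos\theta_f.
\end{aligned} \tag{3.29}$$

With these conventions we have the diagonalizing angle and mass
eigenvalues

$$\theta_f = {\rm sign}[-m_{LR}^2]
\begin{cases}
\dfrac{\pi}{2} - \dfrac{1}{2}\arctan\left|2m_{LR}^2/(m_L^2 - m_R^2)\right|, & m_L^2 > m_R^2, \\[2ex]
\dfrac{1}{2}\arctan\left|2m_{LR}^2/(m_L^2 - m_R^2)\right|, & m_L^2 < m_R^2,
\end{cases}$$

$$m_{1,2}^2 = \frac{1}{2}\left[(m_R^2 + m_L^2) \mp \sqrt{(m_R^2 - m_L^2)^2 + 4 m_{LR}^4}\right]. \tag{3.30}$$

Here $\theta_f$ is chosen so that $m_1$ is always lighter than $m_2$. Note that
in the special case $m_L = m_R = m$, we have
$\theta_f = {\rm sign}[-m_{LR}^2](\pi/4)$ and $m_{1,2}^2 = m^2 \mp |m_{LR}^2|$.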
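
The diagonalization in (3.29) and (3.30) can be checked with a few lines of code. The sketch below returns the mixing angle and the two eigenvalues for a given $2 \times 2$ mass$^2$ matrix with entries $m_L^2$, $m_R^2$ and $m_{LR}^2$; the function name, and the choice ${\rm sign}[0] = +1$ for vanishing mixing, are assumptions made for illustration.

```python
import math


def sfermion_mixing(m_l_sq, m_r_sq, m_lr_sq):
    """Return (theta_f, m_1^2, m_2^2) following (3.29)-(3.30), with m_1 <= m_2."""
    mean = 0.5 * (m_l_sq + m_r_sq)
    half_split = 0.5 * math.sqrt((m_r_sq - m_l_sq) ** 2 + 4.0 * m_lr_sq ** 2)
    sign = 1.0 if m_lr_sq <= 0.0 else -1.0  # sign[-m_LR^2], taking sign[0] = +1
    if m_l_sq == m_r_sq:
        theta = sign * math.pi / 4.0
    else:
        base = 0.5 * math.atan(abs(2.0 * m_lr_sq / (m_l_sq - m_r_sq)))
        theta = sign * (math.pi / 2.0 - base if m_l_sq > m_r_sq else base)
    return theta, mean - half_split, mean + half_split


if __name__ == "__main__":
    # Illustrative entries in arbitrary units of mass^2.
    print(sfermion_mixing(4.0, 1.0, 1.0))
```

A direct check is that the vector $(\cos\theta_f, \sin\theta_f)$ in the $(\tilde f_L, \tilde f_R)$ basis is an eigenvector of the mass$^2$ matrix with the lighter eigenvalue $m_1^2$.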



3.3 Neutralinos 



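There are four new neutral fermions in the MSSM which not only receive
mass but mix as well. These are the gauge fermion partners of the neutral
$B$ and $W^3$ gauge bosons, and the partners of the Higgs. The two gauginos
are called the bino, $\tilde B$, and wino, $\tilde W^3$, respectively. The latter
two are the Higgsinos, $\tilde H_1$ and $\tilde H_2$. In addition to the
supersymmetry breaking gaugino mass terms, $-\frac{1}{2} M_1 \tilde B \tilde B$ and
$-\frac{1}{2} M_2 \tilde W^3 \tilde W^3$, there are supersymmetric mass
contributions of the type $W^{ij}\psi_i\psi_j$, giving a mixing term between
$\tilde H_1$ and $\tilde H_2$, $\frac{1}{2}\mu \tilde H_1 \tilde H_2$, as well as
terms of the form $g(\phi^* T^a \psi)\lambda^a$ giving the following mass terms
after the appropriate Higgs vevs have been inserted,
$\frac{1}{\sqrt{2}} g' v_1 \tilde H_1 \tilde B$,
$-\frac{1}{\sqrt{2}} g' v_2 \tilde H_2 \tilde B$,
$-\frac{1}{\sqrt{2}} g v_1 \tilde H_1 \tilde W^3$, and
$\frac{1}{\sqrt{2}} g v_2 \tilde H_2 \tilde W^3$. These latter terms can be
written in a simpler form noting that, for example,
$g' v_1/\sqrt{2} = M_Z \sin\theta_W \cos\beta$. Thus we can write the neutralino
mass matrix as (in the $(\tilde B, \tilde W^3, \tilde H_1^0, \tilde H_2^0)$ basis) [19]

$$\begin{pmatrix}
M_1 & 0 & -M_Z s_{\theta_W}\cos\beta & M_Z s_{\theta_W}\sin\beta \\
0 & M_2 & M_Z c_{\theta_W}\cos\beta & -M_Z c_{\theta_W}\sin\beta \\
-M_Z s_{\theta_W}\cos\beta & M_Z c_{\theta_W}\cos\beta & 0 & -\mu \\
M_Z s_{\theta_W}\sin\beta & -M_Z c_{\theta_W}\sin\beta & -\mu & 0
\end{pmatrix} \tag{3.31}$$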

where $s_{\theta_W} = \sin\theta_W$ and $c_{\theta_W} = \cos\theta_W$. The mass
eigenstates (a linear combination of the four neutralino states) and the mass
eigenvalues are found by diagonalizing the mass matrix (3.31). However, by a
change of basis involving two new states [19]

$$\tilde S^0 = \tilde H_1 \sin\beta + \tilde H_2 \cos\beta \tag{3.32}$$

$$\tilde A^0 = -\tilde H_1 \cos\beta + \tilde H_2 \sin\beta \tag{3.33}$$

the mass matrix simplifies and becomes (in the
$(\tilde B, \tilde W^3, \tilde A, \tilde S)$ basis)

$$\begin{pmatrix}
M_1 & 0 & M_Z s_{\theta_W} & 0 \\
0 & M_2 & -M_Z c_{\theta_W} & 0 \\
M_Z s_{\theta_W} & -M_Z c_{\theta_W} & \mu\sin 2\beta & \mu\cos 2\beta \\
0 & 0 & \mu\cos 2\beta & -\mu\sin 2\beta
\end{pmatrix} \tag{3.34}$$

In this basis, the eigenstates (which, as one can see, depend only on the
three input masses, $M_1$, $M_2$, and $\mu$) can be solved for analytically.

Before moving on to discuss the chargino mass matrix, it will be useful
for the later discussion to identify a few other neutralino states. These are
the photino,

$$\tilde\gamma = \tilde W^3 \sin\theta_W + \tilde B \cos\theta_W \tag{3.35}$$

and a symmetric and antisymmetric combination of Higgsinos,

$$\tilde H_{(12)} = \frac{1}{\sqrt{2}}\left(\tilde H_1 + \tilde H_2\right) \tag{3.36}$$

$$\tilde H_{[12]} = \frac{1}{\sqrt{2}}\left(\tilde H_1 - \tilde H_2\right). \tag{3.37}$$
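
The change of basis from (3.31) to (3.34) can be verified numerically. The sketch below builds the matrix (3.31), rotates the two Higgsino states into $\tilde A^0$ and $\tilde S^0$ as in (3.32) and (3.33), and returns the rotated matrix, which can then be compared entry by entry with (3.34). The function names and the default values of $M_Z$ and $\sin^2\theta_W$ are illustrative assumptions.

```python
import math


def neutralino_matrix(m1, m2, mu, tan_beta, m_z=91.19, sin2_theta_w=0.231):
    """Mass matrix (3.31) in the (bino, wino, higgsino_1, higgsino_2) basis."""
    beta = math.atan(tan_beta)
    sb, cb = math.sin(beta), math.cos(beta)
    sw, cw = math.sqrt(sin2_theta_w), math.sqrt(1.0 - sin2_theta_w)
    return [
        [m1, 0.0, -m_z * sw * cb, m_z * sw * sb],
        [0.0, m2, m_z * cw * cb, -m_z * cw * sb],
        [-m_z * sw * cb, m_z * cw * cb, 0.0, -mu],
        [m_z * sw * sb, -m_z * cw * sb, -mu, 0.0],
    ]


def rotate_higgsinos(matrix, tan_beta):
    """Rotate to the (bino, wino, A, S) basis of (3.32)-(3.33): R M R^T."""
    beta = math.atan(tan_beta)
    sb, cb = math.sin(beta), math.cos(beta)
    rot = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, -cb, sb],  # A = -H1 cos(beta) + H2 sin(beta)
        [0.0, 0.0, sb, cb],   # S =  H1 sin(beta) + H2 cos(beta)
    ]
    n = range(4)
    tmp = [[sum(rot[i][k] * matrix[k][j] for k in n) for j in n] for i in n]
    return [[sum(tmp[i][k] * rot[j][k] for k in n) for j in n] for i in n]


if __name__ == "__main__":
    # Illustrative inputs in GeV.
    for row in rotate_higgsinos(neutralino_matrix(100.0, 200.0, 300.0, 5.0), 5.0):
        print(["%.3f" % x for x in row])
```

Because the rotation is orthogonal, the eigenvalues are unchanged; only the block structure becomes simpler, with the $\tilde S^0$ state decoupled from the gauginos.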



3.4 Charginos 



There are two new charged fermionic states which are the partners of the
$W^\pm$ gauge bosons and the charged Higgs scalars, $H^\pm$, which are the charged
gauginos, $\tilde W^\pm$, and charged Higgsinos, $\tilde H^\pm$, or collectively
charginos. The chargino mass matrix is composed similarly to the neutralino
mass matrix. The result for the mass term is

$$-\frac{1}{2}\,(\tilde W^-, \tilde H^-)
\begin{pmatrix} M_2 & \sqrt{2}\, m_W \sin\beta \\ \sqrt{2}\, m_W \cos\beta & \mu \end{pmatrix}
\begin{pmatrix} \tilde W^+ \\ \tilde H^+ \end{pmatrix} + {\rm h.c.} \tag{3.38}$$

Note that unlike the case for neutralinos, two unitary matrices must be
constructed to diagonalize (3.38). The result for the masses of the
two chargino eigenstates is

$$m_{\tilde c_1}^2, m_{\tilde c_2}^2 = \frac{1}{2}\left[M_2^2 + \mu^2 + 2M_W^2 \mp \sqrt{(M_2^2 + \mu^2 + 2M_W^2)^2 - 4(\mu M_2 - M_W^2 \sin 2\beta)^2}\right]. \tag{3.39}$$
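
As a quick numerical check of (3.39), the sketch below evaluates the two chargino masses for given $M_2$, $\mu$ and $\tan\beta$. The function name and the value of $M_W$ are illustrative assumptions.

```python
import math


def chargino_masses(m2, mu, tan_beta, m_w=80.4):
    """Return (m_c1, m_c2) in GeV from (3.39), with m_c1 <= m_c2."""
    sin_2beta = math.sin(2.0 * math.atan(tan_beta))
    trace = m2 ** 2 + mu ** 2 + 2.0 * m_w ** 2   # sum of the two squared masses
    det = mu * m2 - m_w ** 2 * sin_2beta         # determinant of the matrix in (3.38)
    root = math.sqrt(max(0.0, trace ** 2 - 4.0 * det ** 2))
    light = math.sqrt(max(0.0, 0.5 * (trace - root)))
    heavy = math.sqrt(0.5 * (trace + root))
    return light, heavy


if __name__ == "__main__":
    print(chargino_masses(200.0, 300.0, 10.0))  # illustrative inputs in GeV
```

Two consistency relations follow directly from (3.39): the squared masses sum to $M_2^2 + \mu^2 + 2M_W^2$, and the product of the two masses equals $|\mu M_2 - M_W^2 \sin 2\beta|$, the absolute value of the determinant of the matrix in (3.38).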



3.5 More supersymmetry breaking 

As was noted earlier, supersymmetry breaking can be associated with a
positive vacuum energy density, $V > 0$. Clearly from the definition of
the scalar potential, this can be achieved if either (or both) the $F$-terms
or the $D$-terms are non-zero. These two possibilities are called
$F$-breaking and $D$-breaking respectively (for obvious reasons).

3.5.1 D-Breaking

One of the simplest mechanisms for the spontaneous breaking of
supersymmetry, proposed by Fayet and Iliopoulos [20], involves merely adding to
the Lagrangian a term proportional to $D$,

$$\mathcal{L}_D = \kappa D. \tag{3.40}$$

It is easy to see by examining (2.26) that this is possible only for a $U(1)$
gauge symmetry. For a $U(1)$, the variation of (3.40) under supersymmetry
is simply a total derivative. The scalar potential is now modified

$$V(D) = -\frac{1}{2}|D|^2 - \kappa D - g(q_i \phi^{i*} \phi_i) D \tag{3.41}$$

where $q_i$ is the $U(1)$ charge of the scalar $\phi_i$. As before, the
equations of motion are used to eliminate the auxiliary field $D$ to give

$$D = -\kappa - g(q_i \phi^{i*} \phi_i). \tag{3.42}$$

So long as the $U(1)$ itself remains unbroken (and the scalar fields $\phi_i$ do
not pick up expectation values), we can set $\langle\phi_i\rangle = 0$, and hence

$$\langle D \rangle = -\kappa \tag{3.43}$$

with

$$V = \frac{1}{2}\kappa^2 \tag{3.44}$$

and supersymmetry is broken. Unfortunately, it is not possible that
$D$-breaking of this type occurs on the basis of the known $U(1)$ in the standard
model, i.e., $U(1)_Y$, as the absence of the necessary mass terms in the
superpotential would not prevent the known sfermions from acquiring vevs.
It may be possible that some other $U(1)$ is responsible for supersymmetry
breaking via $D$-terms, but this is most certainly beyond the context of the
MSSM.
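
The step from (3.41) to (3.44) can be made explicit. The short derivation below is a sketch that uses only the equation of motion (3.42); it shows that the potential reduces to a perfect square, so that it is positive whenever $\langle D \rangle \neq 0$.

```latex
% Insert D = -\kappa - g\,(q_i \phi^{i*}\phi_i) from (3.42) into (3.41):
\begin{aligned}
V &= -\tfrac{1}{2} D^2 - D\left[\kappa + g\,(q_i \phi^{i*}\phi_i)\right] \\
  &= -\tfrac{1}{2} D^2 + D^2
   = \tfrac{1}{2} D^2
   = \tfrac{1}{2}\left[\kappa + g\,(q_i \phi^{i*}\phi_i)\right]^2 , \\
V\big|_{\langle \phi_i \rangle = 0} &= \tfrac{1}{2}\kappa^2 > 0
   \quad \text{for } \kappa \neq 0 .
\end{aligned}
```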

3.5.2 F-Breaking 

Although $F$-type breaking also requires going beyond the standard model,
it does not do so in the gauge sector of the theory. $F$-type breaking can
be achieved relatively easily by adding a few gauge singlet chiral multiplets,
and one of the simplest mechanisms was proposed by O'Raifeartaigh [21]. In
one version of this model, we add three chiral supermultiplets, $A$, $B$, and
$C$, which are coupled through the superpotential

$$W = \alpha A B^2 + \beta C (B^2 - m^2). \tag{3.45}$$

The scalar potential is easily determined from (3.45)

$$\begin{aligned}
V &= |F_A|^2 + |F_B|^2 + |F_C|^2 \\
  &= |\alpha B^2|^2 + |2B(\alpha A + \beta C)|^2 + |\beta(B^2 - m^2)|^2.
\end{aligned} \tag{3.46}$$

Clearly, the first and third terms of this potential cannot be made to
vanish simultaneously, and so for example, if $B = 0$, $F_C \neq 0$, $V > 0$, and
supersymmetry is broken.

It is interesting to examine the fermion mass matrix for the above toy
model. The mass matrix is determined from the superpotential through
$W^{ij}\psi_i\psi_j$ and in the $(A, B, C)$ basis gives

$$\begin{pmatrix}
0 & 2\alpha B & 0 \\
2\alpha B & 2(\alpha A + \beta C) & 2\beta B \\
0 & 2\beta B & 0
\end{pmatrix}. \tag{3.47}$$

The fact that the determinant of this matrix is zero indicates that there is
at least one massless fermion state, the Goldstino.
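
Both statements about this toy model can be checked numerically. The sketch below evaluates the potential (3.46) and the determinant of the fermion mass matrix (3.47) for given field values; the function names and the parameter values in the example are illustrative assumptions.

```python
def oraifeartaigh_potential(a, b, c, alpha, beta, m):
    """V = |F_A|^2 + |F_B|^2 + |F_C|^2 for W = alpha A B^2 + beta C (B^2 - m^2)."""
    f_a = alpha * b ** 2
    f_b = 2.0 * b * (alpha * a + beta * c)
    f_c = beta * (b ** 2 - m ** 2)
    return abs(f_a) ** 2 + abs(f_b) ** 2 + abs(f_c) ** 2


def fermion_mass_matrix(a, b, c, alpha, beta):
    """Second derivatives W^{ij} in the (A, B, C) basis, as in (3.47)."""
    return [
        [0.0, 2.0 * alpha * b, 0.0],
        [2.0 * alpha * b, 2.0 * (alpha * a + beta * c), 2.0 * beta * b],
        [0.0, 2.0 * beta * b, 0.0],
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


if __name__ == "__main__":
    # Scan real B with A = C = 0 and alpha = beta = m = 1 (illustrative values).
    grid = [i / 100.0 for i in range(-200, 201)]
    print(min(oraifeartaigh_potential(0.0, b, 0.0, 1.0, 1.0, 1.0) for b in grid))
    print(det3(fermion_mass_matrix(0.3, 0.7, -1.2, 1.0, 2.0)))
```

For real couplings and real $B$, minimizing the first and third terms of (3.46) over $B^2$ gives $V_{\min} = \alpha^2\beta^2 m^4/(\alpha^2 + \beta^2)$, which is strictly positive for non-zero $\alpha$, $\beta$ and $m$.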

The existence of the Goldstino as a signal of supersymmetry breaking
was already mentioned in the previous section. It is relatively
straightforward to see that the Goldstino can be directly constructed from the
$F$- and $D$-terms which break supersymmetry. Consider the mass matrix for a
gaugino $\lambda^a$, and chiral fermion $\psi_i$

$$\begin{pmatrix}
0 & \sqrt{2}\, g (\langle\phi^*\rangle T^a)^i \\
\sqrt{2}\, g (\langle\phi^*\rangle T^a)^j & \langle W^{ij} \rangle
\end{pmatrix} \tag{3.48}$$

where we do not assume any supersymmetry breaking gaugino mass. Consider
further the fermion

$$\tilde G = \left(\langle D^a \rangle/\sqrt{2},\ \langle F_i \rangle\right) \tag{3.49}$$

in the $(\lambda, \psi)$ basis. Now from the condition (2.30) and the
requirement that we are sitting at the minimum of the potential so that

$$\frac{\partial V}{\partial \phi_j} = 0 \ \Leftrightarrow\ g \langle D^a \rangle (\langle\phi^*\rangle T^a)^j + \langle F_i \rangle \langle W^{ij} \rangle = 0 \tag{3.50}$$

we see that the fermion $\tilde G$ is massless, that is, it is annihilated by
the mass matrix (3.48). The Goldstino state $\tilde G$ is physical so long as
one or both of $\langle D \rangle \neq 0$ or $\langle F \rangle \neq 0$. This is
the analog of the Goldstone mechanism for the breaking of global symmetries.

3.6 R-parity

In defining the supersymmetric standard model, and in particular the
minimal model or MSSM, we have limited the model to contain a minimal field
content. That is, the only new fields are those which are required by
supersymmetry. In effect, this means that other than superpartners, only the
Higgs sector was enlarged from one doublet to two. However, in writing
the superpotential (3.1), we have also made a minimal choice regarding
interactions. We have limited the types of interactions to include only those
required in the standard model and its supersymmetric generalization.

However, even if we stick to the minimal field content, there are several 
other superpotential terms which we can envision adding to (3.1) which are 
consistent with all of the symmetries (in particular the gauge symmetries) 
of the theory. For example, we could consider 

TUr = (3.51) 

In (3.51), the terms proportional to A, A', and qi' , all violate lepton number 
by one unit. The term proportional to X" violates baryon number by one 
unit. 

Each of the terms in (3.51) predicts new particle interactions and can 
be to some extent constrained by the lack of observed exotic phenomena. 
However, the combination of terms which violate both baryon and lepton 
number can be disastrous. For example, consider the possibility that both 
A' and A" were non-zero. This would lead to the following diagram which 
mediates proton decay, p e~^TT^ , ,vK~^ etc. Because of the 




Fig. 3. R-parity violating contribution to proton decay. 



necessary antisymmetry of the final two flavor indices in A", there can be 
no d‘^ exchange in this diagram. The rate of proton decay as calculated 
from this diagram will be enormous due to the lack of any suppression 
by superheavy masses. There is no GUT or Planck scale physics which 
enters in, this is a purely (supersymmetric) standard model interaction. 
The (inverse) rate can be easily estimated to be 



$$\Gamma_p^{-1} \sim \frac{\tilde m^4}{m_p^5} \sim 10^{8}\ \mathrm{GeV}^{-1} \tag{3.52}$$

assuming a supersymmetry breaking scale $\tilde m$ of order 100 GeV. This
should be compared with current limits to the proton life-time of
$\gtrsim 10^{63}\ \mathrm{GeV}^{-1}$.
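As a rough numerical check of the estimate (3.52), the following sketch evaluates $\tilde m^4/m_p^5$ and converts both the estimate and the experimental bound to years. The function names, the assumption of order-one couplings, the proton mass of 0.938 GeV, and the conversion constants are illustrative choices that are not taken from the text.

```python
# Order-of-magnitude check of the R-parity violating proton lifetime.
# Assumptions (not from the text): couplings lambda', lambda'' of order one,
# m_p = 0.938 GeV, and hbar = 6.582e-25 GeV s for the unit conversion.

HBAR_GEV_S = 6.582e-25      # hbar in GeV s
SECONDS_PER_YEAR = 3.156e7


def inverse_rate_gev(m_susy_gev: float, m_proton_gev: float = 0.938) -> float:
    """Return Gamma_p^-1 ~ m_susy^4 / m_p^5 in GeV^-1 (dimensional estimate)."""
    return m_susy_gev**4 / m_proton_gev**5


def gev_inverse_to_years(tau_gev_inv: float) -> float:
    """Convert a time expressed in GeV^-1 to years."""
    return tau_gev_inv * HBAR_GEV_S / SECONDS_PER_YEAR


tau_rpv = inverse_rate_gev(100.0)   # supersymmetry breaking scale of 100 GeV
tau_limit = 1e63                    # experimental bound quoted in the text, GeV^-1
print(f"estimate: {tau_rpv:.1e} GeV^-1 = {gev_inverse_to_years(tau_rpv):.1e} yr")
print(f"limit:    {tau_limit:.1e} GeV^-1 = {gev_inverse_to_years(tau_limit):.1e} yr")
```

The two numbers differ by some 55 orders of magnitude, which is the sense in which the rate is "enormous".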

It is possible to eliminate the unwanted superpotential terms by imposing
a discrete symmetry on the theory. This symmetry has been called
R-parity [22], and can be defined as

$$R = (-1)^{3B+L+2s} \tag{3.53}$$

where B, L, and s are the baryon number, lepton number, and spin
respectively. With this definition, it turns out that all of the known standard
model particles have R-parity +1. For example, the electron has B = 0,
L = -1, and s = 1/2, while the photon has B = L = 0 and s = 1. In both
cases, R = 1. Similarly, it is clear that all superpartners of the known
standard model particles have R = -1, since they must have the same values of
B and L but differ by 1/2 unit of spin. If R-parity is exactly conserved,
then all four superpotential terms in (3.51) must be absent. But perhaps
an even more important consequence of R-parity is the prediction that the
lightest supersymmetric particle or LSP is stable. In much the same way
that baryon number conservation predicts proton stability, R-parity
predicts that the lightest R = -1 state is stable. This makes supersymmetry
an extremely interesting theory from the astrophysical point of view, as the
LSP naturally becomes a viable dark matter candidate [19,23]. This will
be discussed in detail in the 6th lecture.
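The R-parity assignment (3.53) is easy to tabulate. The sketch below is a minimal illustration: the function name `r_parity` and the particular list of states are my own choices, and exact fractions are used so that the exponent 3B + L + 2s is guaranteed to be an integer.

```python
from fractions import Fraction


def r_parity(baryon: Fraction, lepton: Fraction, spin: Fraction) -> int:
    """Return R = (-1)^(3B + L + 2s) for the given quantum numbers."""
    exponent = 3 * baryon + lepton + 2 * spin
    if exponent.denominator != 1:
        raise ValueError("3B + L + 2s must be an integer")
    return 1 if exponent.numerator % 2 == 0 else -1


half = Fraction(1, 2)
# (B, L, s) for a few states; the electron's L = -1 follows the text's convention.
states = {
    "electron": (Fraction(0), Fraction(-1), half),
    "photon": (Fraction(0), Fraction(0), Fraction(1)),
    "quark": (Fraction(1, 3), Fraction(0), half),
    "selectron": (Fraction(0), Fraction(-1), Fraction(0)),
    "photino": (Fraction(0), Fraction(0), half),
    "squark": (Fraction(1, 3), Fraction(0), Fraction(0)),
}
for name, (b, l, s) in states.items():
    print(f"{name:10s} R = {r_parity(b, l, s):+d}")
```

Each superpartner differs from its standard model partner only by half a unit of spin, so its exponent shifts by one and the sign of R flips.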



4 The constrained MSSM and supergravity 

As a phenomenological model, while the MSSM has all of the ingredients
which are necessary, plus a large number of testable predictions, it contains
far too many parameters to pin down a unique theory. Fortunately, there are
a great many constraints on these parameters due to the possibility of exotic
interactions, as was the case for additional R-violating superpotential terms.
The supersymmetry breaking sector of the theory contains a huge number of
potentially independent masses. However, complete arbitrariness in the soft
sfermion masses would be a phenomenological disaster. For example, mixing
in the squark sector would lead to a completely unacceptable preponderance
of flavor changing neutral currents [24].

Fortunately, there are some guiding principles that we can use to relate
the various soft parameters, which not only greatly simplify the theory
but also remove the phenomenological problems. Indeed, among the
motivations for supersymmetry was a resolution of the hierarchy problem [3].
We can therefore look to unification (grand unification or even unification
with gravity) to establish these relations [25].
The simplest such hypothesis is to assume that all of the soft
supersymmetry breaking masses have their origin at some high energy scale, such
as the GUT scale. We can further assume that these masses obey some
unification principle. For example, in the context of grand unification, one
would expect all of the members of a given GUT multiplet to receive a
common supersymmetry breaking mass. Thus it would be natural
to assume that at the GUT scale, all of the gaugino masses were equal,
$M_1 = M_2 = M_3$ (the latter is the gluino mass). While one is tempted to
make a similar assumption in the case of the sfermion masses (and we will
do so), it is not as well justified. While one can easily argue that sfermions
in a given GUT multiplet should obtain a common soft mass, it is not as
obvious why all R = -1 scalars should receive a common mass.

Having made the assumption that the input parameters are fixed at 
the GUT scale, one must still calculate their values at the relevant low 
energy scale. This is accomplished by “running” the renormalization group 
equations [26]. Indeed, in standard (non-supersymmetric) GUTs, the gauge 
couplings are fixed at the unification scale and run down to the electroweak 
scale. Conversely, one can use the known values of the gauge couplings and
run them up to determine the unification scale (assuming that the couplings 
meet at a specific renormalization scale). 



4.1 RG evolution 



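To check the prospects of unification in detail requires using the two-loop
renormalization equations

$$\frac{d\alpha_i}{dt} = -\frac{1}{4\pi}\left(b_i + \frac{b_{ij}}{4\pi}\alpha_j\right)\alpha_i^2 \tag{4.1}$$

where $t = \ln(M_{GUT}^2/Q^2)$, and the $b_i$ are given by

$$b_i = \begin{pmatrix} 0 \\ -\frac{22}{3} \\ -11 \end{pmatrix} + N_g \begin{pmatrix} \frac{4}{3} \\ \frac{4}{3} \\ \frac{4}{3} \end{pmatrix} + N_H \begin{pmatrix} \frac{1}{10} \\ \frac{1}{6} \\ 0 \end{pmatrix} \tag{4.2}$$

from gauge bosons, $N_g$ matter generations and $N_H$ Higgs doublets,
respectively, and at two loops

$$b_{ij} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -\frac{136}{3} & 0 \\ 0 & 0 & -102 \end{pmatrix} + N_g \begin{pmatrix} \frac{19}{15} & \frac{3}{5} & \frac{44}{15} \\ \frac{1}{5} & \frac{49}{3} & 4 \\ \frac{11}{30} & \frac{3}{2} & \frac{76}{3} \end{pmatrix} + N_H \begin{pmatrix} \frac{9}{50} & \frac{9}{10} & 0 \\ \frac{3}{10} & \frac{13}{6} & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{4.3}$$

These coefficients depend only on the light particle content of the theory.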

However, using the known inputs at the electroweak scale, one finds [4] 
that the couplings of the standard model are not unified at any high energy 
scale. This is shown in Figure 4. 




Fig. 4. RG evolution of the inverse gauge couplings in the standard model [4, 10]. 



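In the MSSM, the additional particle content changes the slopes in the
RGE evolution equations. Including supersymmetric particles, one finds [27]

$$b_i = \begin{pmatrix} 0 \\ -6 \\ -9 \end{pmatrix} + N_g \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix} + N_H \begin{pmatrix} \frac{3}{10} \\ \frac{1}{2} \\ 0 \end{pmatrix} \tag{4.4}$$

and

$$b_{ij} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -24 & 0 \\ 0 & 0 & -54 \end{pmatrix} + N_g \begin{pmatrix} \frac{38}{15} & \frac{6}{5} & \frac{88}{15} \\ \frac{2}{5} & 14 & 8 \\ \frac{11}{15} & 3 & \frac{68}{3} \end{pmatrix} + N_H \begin{pmatrix} \frac{9}{50} & \frac{9}{10} & 0 \\ \frac{3}{10} & \frac{7}{2} & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{4.5}$$

In this case, it is either a coincidence, or it is rather remarkable that the RG
evolution is altered in just such a way as to make the MSSM consistent with
unification. The MSSM evolution is shown in Figure 5 below. For many,
this concordance of the gauge couplings and GUTs offers strong motivation
for considering supersymmetry.

Fig. 5. RG evolution of the inverse gauge couplings in the MSSM [4, 10].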
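The contrast between the standard model and the MSSM can be reproduced at one loop with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the inverse couplings at $M_Z$ are rounded values in GUT normalization that I have supplied, the one-loop coefficients are the standard values for three generations, two-loop terms are dropped, and all superpartners are taken to sit at $M_Z$ so that thresholds are ignored.

```python
import math

M_Z = 91.19  # GeV

# Assumed inputs: inverse couplings at M_Z in GUT normalization (rounded).
ALPHA_INV_MZ = (59.0, 29.6, 8.5)

# Standard one-loop coefficients for N_g = 3 generations.
B_SM = (41 / 10, -19 / 6, -7.0)     # one Higgs doublet
B_MSSM = (33 / 5, 1.0, -3.0)        # two Higgs doublets plus superpartners


def crossing_scale(i: int, j: int, b: tuple) -> float:
    """Scale Q (GeV) at which alpha_i and alpha_j meet at one loop.

    Uses 1/alpha_i(Q) = 1/alpha_i(M_Z) - (b_i / 2 pi) ln(Q / M_Z).
    """
    log_ratio = 2 * math.pi * (ALPHA_INV_MZ[i] - ALPHA_INV_MZ[j]) / (b[i] - b[j])
    return M_Z * math.exp(log_ratio)


for label, b in (("SM", B_SM), ("MSSM", B_MSSM)):
    q12 = crossing_scale(0, 1, b)
    q23 = crossing_scale(1, 2, b)
    print(f"{label:5s} alpha1=alpha2 at {q12:.1e} GeV, alpha2=alpha3 at {q23:.1e} GeV")
```

In the standard model the two crossing points are separated by several orders of magnitude in Q, while with the MSSM coefficients they nearly coincide, which is the one-loop version of the statement made in the text.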

As was noted earlier, most of the parameters in the MSSM are also
subject to RG evolution. For example, all of the Yukawa couplings also
run. Here, I just list the RG equation for the top quark Yukawa,

$$\frac{d\alpha_t}{dt} = \frac{\alpha_t}{4\pi}\left(\frac{16}{3}\alpha_3 + 3\alpha_2 + \frac{13}{15}\alpha_1 - 6\alpha_t - \alpha_b + \cdots\right) \tag{4.6}$$

where $\alpha_t = h_t^2/4\pi$. This is the leading part of the 1-loop correction. For a
more complete list of these equations see [28]. These expressions are also
known to higher order [29]. Note that the scaling of each of the supersymmetric
couplings is proportional to the coupling itself. That means that
if the coupling is not present at the tree level, it will not be generated by
radiative corrections either. This is a general consequence of supersymmetric
nonrenormalization theorems [30].

The supersymmetry breaking mass parameters also run. Starting with
the gaugino masses, we have

$$\frac{dM_i}{dt} = -b_i \alpha_i M_i/4\pi. \tag{4.7}$$

Assuming a common gaugino mass, $m_{1/2}$, at the GUT scale as was
discussed earlier, these equations are easily solved in terms of the fine structure
constants,

$$M_i(t) = \frac{\alpha_i(t)}{\alpha_i(M_{GUT})}\, m_{1/2}. \tag{4.8}$$

This implies that

$$\frac{M_1}{g_1^2} = \frac{M_2}{g_2^2} = \frac{M_3}{g_3^2}. \tag{4.9}$$

(Actually, in a GUT, one must modify the relation due to the difference
between the U(1) factors in the GUT and the standard model, so that we
have $M_1 = \frac{5}{3}\frac{\alpha_1}{\alpha_2} M_2$.)
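The one-loop solution $M_i = \alpha_i\, m_{1/2}/\alpha_{GUT}$ can be turned into rough weak-scale numbers. In the sketch below the couplings at $M_Z$ (in GUT normalization) and the unified coupling are rounded values that I have assumed for illustration; they are not taken from the text, and the result is only meant to show the pattern $M_1 < M_2 < M_3$.

```python
# Weak-scale gaugino masses from M_i = alpha_i(M_Z) / alpha_GUT * m_half.
# Assumed inputs (rounded, GUT-normalized): not taken from the text.
ALPHA_MZ = {"M1": 1 / 59.0, "M2": 1 / 29.6, "M3": 0.118}
ALPHA_GUT = 1 / 24.3


def gaugino_masses(m_half: float) -> dict:
    """Return one-loop gaugino masses (GeV) for a common GUT-scale mass m_half."""
    return {name: alpha / ALPHA_GUT * m_half for name, alpha in ALPHA_MZ.items()}


masses = gaugino_masses(250.0)
for name, mass in masses.items():
    print(f"{name} ~ {mass:.0f} GeV")
```

For $m_{1/2} = 250$ GeV these rough values land in the same range as the entries for $M_1$, $M_2$ and $M_3$ in Table 1, which come from a full calculation.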

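Finally, we have the running of the remaining mass parameters. A few
examples are:

$$\frac{d\mu^2}{dt} = \frac{3\mu^2}{4\pi}\left(\alpha_2 + \frac{1}{5}\alpha_1 - \alpha_t - \alpha_b + \cdots\right) \tag{4.10}$$

$$\frac{dm_{e^c}^2}{dt} = \frac{12}{5}\frac{\alpha_1}{4\pi} M_1^2 + \cdots \tag{4.11}$$

$$\frac{dA_i}{dt} = \frac{1}{4\pi}\left(\frac{16}{3}\alpha_3 M_3 + 3\alpha_2 M_2 + \frac{13}{15}\alpha_1 M_1 - 6\alpha_t A_t + \cdots\right) \tag{4.12}$$

$$\frac{dB}{dt} = \frac{3}{4\pi}\left(\alpha_2 M_2 + \frac{1}{5}\alpha_1 M_1 - 3\alpha_t A_t + \cdots\right). \tag{4.13}$$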



4.2 The constrained MSSM 

As the name implies, the constrained MSSM or CMSSM is a subset of the
possible parameter sets in the MSSM. In the CMSSM [31,32], we try to
make as many reasonable and well motivated assumptions as possible. To
begin with, gaugino mass unification is assumed. (This is actually a common
assumption in the MSSM as well.) Furthermore, soft scalar mass unification
or universality is also assumed. This implies that all soft scalar masses are
assumed equal at the GUT input scale, so that

$$\tilde m^2(M_{GUT}) = m_0^2. \tag{4.14}$$

This condition is applied not only to the sfermion masses, but also to the soft
Higgs masses, $m^2_{1,2}$, as well. By virtue of the conditions (3.12) and (3.13), we
see that in the CMSSM, $\mu$ and $B\mu$ (or $m_A^2$) are no longer free parameters,
since these conditions amount to $m_A^2(m_1^2, m_2^2, \mu^2)$ and $v^2(m_1^2, m_2^2, \mu^2)$. Thus
we are either free to pick $m_A$ and $\mu$ as free parameters (this fixes $m_{1,2}$, though
we are usually not interested in those quantities) as in the MSSM, or choose
$m_{1,2}$ (say at the GUT scale), in which case $m_A$ and $\mu$ become predictions of the model.
Universality of the soft trilinears, $A_i$, is also assumed.

In the CMSSM therefore, we have only the following free input
parameters: $m_{1/2}$, $m_0$, $\tan\beta$, $A_0$, and the sign of $\mu$. We could of course choose
phases for some of these parameters. In the MSSM and CMSSM, there are
two physical phases which can be non-vanishing, $\theta_\mu$ and $\theta_A$. If non-zero,
they lead to various CP violating effects such as inducing electric dipole
moments in the neutron and electron. For some references regarding these
phases see [33-35], but we will not discuss them further in these lectures.




Fig. 6. RG evolution of the mass parameters in the CMSSM, shown as a
function of $\log_{10}(Q/\mathrm{GeV})$. I thank Toby Falk for providing this figure.

In Figure 6, an example of the running of the mass parameters in the
CMSSM is shown. Here, we have chosen $m_{1/2}$ = 250 GeV, $m_0$ = 100 GeV,
$\tan\beta$ = 3, $A_0$ = 0, and $\mu < 0$. Indeed, it is rather amazing that from
so few input parameters, all of the masses of the supersymmetric particles
can be determined. Among the characteristic features that one sees in the
figure is that the colored sparticles are typically the heaviest in the
spectrum. This is due to the large positive correction to the masses due
to $\alpha_3$ in the RGEs. Also, one finds that the $\tilde B$ is typically the lightest
sparticle. But most importantly, notice that one of the Higgs mass$^2$ goes
negative, triggering electroweak symmetry breaking [31]. (The negative sign
in the figure refers to the sign of the mass$^2$, even though it is the mass of
the sparticles which is depicted.) In the table below, I list some of the
resultant electroweak scale masses, for the choice of parameters used in the
figure.
Table 1. Physical mass parameters for $m_{1/2}$ = 250 GeV, $m_0$ = 100 GeV, $\tan\beta$ =
3, $A_0$ = 0, and $\mu < 0$. (These are related but not equal to the running mass
parameters shown in Fig. 6.)

    particle                   mass (GeV)    parameter          value
    $m_{\tilde l}$             203           $\mu$              -391
    $m_{\tilde e^c}$           144           $M_1$              100
    $m_{\tilde\nu}$            190           $M_2$              193
    $m_{\tilde q}$             412-554       $M_3$              726
    $m_{\tilde\chi_1}$         104           $\alpha_3(M_Z)$    .123
    $m_{\tilde\chi_1^\pm}$     203           $A$'s              163-878
    $m_h$                      93
    $m_A$                      466







4.3 Supergravity 

Up until now, we have only considered global supersymmetry. Recall our
generator of supersymmetry transformations, the spinor $\xi$. In all of the
derivations of the transformation and invariance properties in the first two
sections, we had implicitly assumed that $\partial_\mu \xi = 0$. By allowing $\xi(x)$ and
$\partial_\mu \xi(x) \neq 0$, we obtain local supersymmetry or supergravity [36]. It is well
beyond the means of these lectures to give a thorough treatment of local
supersymmetry. We will therefore have to content ourselves with some general
remarks from which we can glimpse some features that we can expect
to have phenomenological relevance.

First, it is important to recognize that our original Lagrangian for the
Wess-Zumino model involving a single noninteracting chiral multiplet will
no longer be invariant under local supersymmetry transformations. New
terms, proportional to $\partial_\mu \xi(x)$, must be canceled. In analogy with gauge
theories, which contain similar terms that are canceled by the introduction
of vector bosons, here the terms must be canceled by introducing a new
spin 3/2 particle called the gravitino. The supersymmetry transformation
property of the gravitino must be

$$\delta_\xi \Psi_\mu^\alpha \propto \partial_\mu \xi^\alpha. \tag{4.15}$$

Notice that the gravitino carries both a gauge and a spinor index. The
gravitino is part of an N = 1 multiplet which contains the spin two graviton.
In unbroken supergravity, the massless gravitino has two spin components
(± 3/2) to match the two polarization states of the graviton.

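The scalar potential is now determined by an analytic function of the
scalar fields, called the Kähler potential, $K(\phi, \phi^*)$. The Kähler potential
can be thought of as a metric in field space,

$$\mathcal{L}_{kin} = K^i_j\, \partial_\mu \phi_i\, \partial^\mu \phi^{j*} \tag{4.16}$$

where $K^i = \partial K/\partial\phi_i$ and $K_i = \partial K/\partial\phi^{i*}$. In what is known as minimal
N = 1 supergravity, the Kähler potential is given by

$$K = \kappa^2 \phi_i \phi^{i*} + \ln(\kappa^6 |W|^2) \tag{4.17}$$

where $W(\phi)$ is the superpotential, $\kappa^{-1} = M_P/\sqrt{8\pi}$ and the Planck mass is
$M_P = 1.2 \times 10^{19}$ GeV. The scalar potential (neglecting any gauge
contributions) is [37]

$$V(\phi, \phi^*) = e^{K} \kappa^{-4} \left[ K^i (K^{-1})^j_i K_j - 3 \right]. \tag{4.18}$$

For minimal supergravity, we have $K^i = \kappa^2 \phi^{i*} + W^i/W$, $K_i = \kappa^2 \phi_i + W_i^*/W^*$,
and $(K^{-1})^j_i = \delta^j_i/\kappa^2$. Thus the resulting scalar potential is

$$V(\phi, \phi^*) = e^{\kappa^2 \phi_i \phi^{i*}} \left[ |W^i + \kappa^2 \phi^{i*} W|^2 - 3\kappa^2 |W|^2 \right]. \tag{4.19}$$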



As we will now see, one of the primary motivations for the CMSSM, and
scalar mass universality, comes from the simplest model for local
supersymmetry breaking. The model [38] involves one additional chiral multiplet
$z$ (above the normal matter fields $\phi_i$). Let us consider therefore a
superpotential which is separable in the so-called Polonyi field and matter,
so that

$$W(z, \phi_i) = f(z) + g(\phi_i) \tag{4.20}$$

and in particular let us choose

$$f(z) = \mu(z + \beta) \tag{4.21}$$

and, for reasons to be clear shortly, $\beta = 2 - \sqrt{3}$. I will from now on work in
units such that $\kappa = 1$. If we ignore for the moment the matter fields $\phi$, the
potential for $z$ becomes

$$V(z, z^*) = e^{zz^*} \mu^2 \left[ |1 + z^*(z + \beta)|^2 - 3|z + \beta|^2 \right]. \tag{4.22}$$

It is not difficult to verify that with the above choice of $\beta$, the minimum of
$V$ occurs at $\langle z \rangle = \sqrt{3} - 1$, with $V(\langle z \rangle) = 0$.

Note that by expanding this potential, one can define two real scalar
fields A and B, with mass eigenvalues,

$$m_A^2 = 2\sqrt{3}\, m_{3/2}^2, \qquad m_B^2 = 2(2 - \sqrt{3})\, m_{3/2}^2 \tag{4.23}$$

where the gravitino mass is

$$m_{3/2} = e^{2 - \sqrt{3}} \mu. \tag{4.24}$$

Note also that there is a mass relation, $m_A^2 + m_B^2 = 4 m_{3/2}^2$, which is a
guaranteed consequence of supertrace formulae in supergravity [37]. Had
we considered the fermionic sector of the theory, we would now find that
the massive gravitino has four helicity states ±1/2 and ±3/2. The
"longitudinal" states arise from the disappearance of the goldstino (the fermionic
partner of $z$ in this case) by the superHiggs mechanism, again in analogy
with the spontaneous breakdown of a gauge symmetry [37-39].
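The statements about the Polonyi potential can be checked numerically. The sketch below works in units $\kappa = \mu = 1$ and evaluates the potential $V = e^{zz^*}[|1 + z^*(z+\beta)|^2 - 3|z+\beta|^2]$ with $\beta = 2 - \sqrt{3}$; the function names, the finite-difference step, and the decomposition $z = (A + iB)/\sqrt{2}$ used to read off the two real masses are my own assumptions for the purpose of illustration.

```python
import math

BETA = 2 - math.sqrt(3)
Z_MIN = math.sqrt(3) - 1          # claimed location of the minimum


def polonyi_potential(z: complex) -> float:
    """Polonyi potential V(z, z*) in units kappa = mu = 1."""
    zc = z.conjugate()
    bracket = abs(1 + zc * (z + BETA)) ** 2 - 3 * abs(z + BETA) ** 2
    return math.exp((z * zc).real) * bracket


def curvature(direction: complex, h: float = 1e-4) -> float:
    """Second derivative of V at Z_MIN along a unit direction in the z plane."""
    v0 = polonyi_potential(complex(Z_MIN))
    vp = polonyi_potential(Z_MIN + h * direction)
    vm = polonyi_potential(Z_MIN - h * direction)
    return (vp - 2 * v0 + vm) / h**2


# Gravitino mass squared: exp(|z|^2) |W|^2, with W = z + beta = 1 at the minimum.
m32_sq = math.exp(Z_MIN**2)
# With z = (A + i B)/sqrt(2), m_A^2 = (1/2) d^2V/d(Re z)^2 and likewise for B.
m_a_sq = 0.5 * curvature(1 + 0j) / m32_sq
m_b_sq = 0.5 * curvature(1j) / m32_sq

print(f"V(z_min)        = {polonyi_potential(complex(Z_MIN)):.2e}")
print(f"m_A^2 / m_32^2  = {m_a_sq:.4f}  (2*sqrt(3)     = {2 * math.sqrt(3):.4f})")
print(f"m_B^2 / m_32^2  = {m_b_sq:.4f}  (2*(2-sqrt(3)) = {2 * (2 - math.sqrt(3)):.4f})")
```

The two curvatures add up to $4 m_{3/2}^2$, which is the mass relation quoted in the text.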

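We next consider the matter potential from equations (4.20) and (4.19).
In full, this takes the form [40]

$$V = e^{(|z|^2 + |\phi|^2)} \left[ \left|\frac{\partial f}{\partial z} + z^*(f(z) + g(\phi))\right|^2 + \left|\frac{\partial g}{\partial \phi} + \phi^*(f(z) + g(\phi))\right|^2 - 3|f(z) + g(\phi)|^2 \right]. \tag{4.25}$$

Here again, I have left out the explicit powers of $M_P$. Expanding this
expression, and at the same time dropping terms which are suppressed by
inverse powers of the Planck scale (this can be done by simply dropping
terms of mass dimension greater than four), we have, after inserting the vev
for $z$ [40],

$$\begin{aligned}
V &= e^{(4 - 2\sqrt{3})} \left[ |\mu + (\sqrt{3} - 1)(\mu + g(\phi))|^2 + \left|\frac{\partial g}{\partial \phi} + \phi^*(\mu + g(\phi))\right|^2 - 3|\mu + g(\phi)|^2 \right] \\
&= e^{(4 - 2\sqrt{3})} \left[ -\sqrt{3}\mu\,(g(\phi) + g^*(\phi^*)) + \left|\frac{\partial g}{\partial \phi}\right|^2 + \left(\mu\left(\phi \frac{\partial g}{\partial \phi} + \phi^* \frac{\partial g^*}{\partial \phi^*}\right) + \mu^2 \phi\phi^*\right) \right] \\
&= e^{(4 - 2\sqrt{3})} \left|\frac{\partial g}{\partial \phi}\right|^2 + m_{3/2}\, e^{(2 - \sqrt{3})} \left(\left(\phi \frac{\partial g}{\partial \phi} - \sqrt{3} g\right) + \mathrm{h.c.}\right) + m_{3/2}^2\, \phi\phi^*.
\end{aligned} \tag{4.26}$$

This last expression deserves some discussion. First, up to an overall
rescaling of the superpotential, $g \to e^{\sqrt{3} - 2} g$, the first term is the ordinary F-term
part of the scalar potential of global supersymmetry. The next term,
proportional to $m_{3/2}$, represents a universal trilinear A-term. This can be seen
by noting that $\sum \phi\, \partial g/\partial \phi = 3g$, so that in this model of supersymmetry
breaking, $A = (3 - \sqrt{3})\, m_{3/2}$. Note that if the superpotential contains
bilinear terms, we would find $B = (2 - \sqrt{3})\, m_{3/2}$. The last term represents a
universal scalar mass of the type advocated in the CMSSM, with $m_0^2 = m_{3/2}^2$.
The generation of such soft terms is a rather generic property of low energy
supergravity models [41].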
Before concluding this section, it is worth noting one other class of
supergravity models, namely the so-called no-scale supergravity model [42].
No-scale supergravity is based on the Kähler potential of the form

$$K = -\ln(S + S^*) - 3\ln(T + T^* - \phi_i \phi^{i*}) + \ln|W|^2 \tag{4.27}$$

where the S and T fields are related to the dilaton and moduli fields in string
theory [43]. If only the T field is kept in (4.27), the resulting scalar potential
is exactly flat, i.e., V = 0 identically. In such a model, the gravitino mass
is undetermined at the tree level, and up to some field redefinitions, there
is a surviving global supersymmetry. No-scale supergravity has been used
heavily in constructing supergravity models in which all mass scales below
the Planck scale are determined radiatively [44,45].



5 Cosmology 



Supersymmetry has had a broad impact on cosmology. In these last two 
lectures, I will try to highlight these. In this lecture, I will briefly review 
the problems induced by supersymmetry, such as the Polonyi or moduli 
problem, and the gravitino problem. I will also discuss the question of cos- 
mological inflation in the context of supersymmetry. Finally, I will describe 
a mechanism for baryogenesis which is purely supersymmetric in origin. I 
will leave the question of dark matter and the accelerator constraints to the 
last lecture. 

Before proceeding to the problems, it will be useful to establish some of
the basic quantities and equations in cosmology. The standard Big Bang
model assumes homogeneity and isotropy, so that space-time can be
described by the Friedmann-Robertson-Walker metric, which in co-moving
coordinates is given by

$$ds^2 = -dt^2 + R^2(t) \left[ \frac{dr^2}{(1 - kr^2)} + r^2 \left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \right] \tag{5.1}$$

where $R(t)$ is the cosmological scale factor and $k$ is the three-space curvature
constant ($k = 0, +1, -1$ for a spatially flat, closed or open Universe). $k$ and
$R$ are the only two quantities in the metric which distinguish it from flat
Minkowski space. It is also common to assume the perfect fluid form for
the energy-momentum tensor

$$T_{\mu\nu} = p\, g_{\mu\nu} + (p + \rho) u_\mu u_\nu \tag{5.2}$$

where $g_{\mu\nu}$ is the space-time metric described by (5.1), $p$ is the isotropic
pressure, $\rho$ is the energy density and $u^\mu = (1, 0, 0, 0)$ is the velocity vector
for the isotropic fluid. Einstein's equations yield the Friedmann equation,

$$H^2 \equiv \left(\frac{\dot R}{R}\right)^2 = \frac{1}{3}\kappa^2 \rho - \frac{k}{R^2} + \frac{1}{3}\Lambda \tag{5.3}$$

and

$$\left(\frac{\ddot R}{R}\right) = \frac{1}{3}\Lambda - \frac{1}{6}\kappa^2 (\rho + 3p) \tag{5.4}$$

where $\Lambda$ is the cosmological constant, or equivalently from $T^{\mu\nu}{}_{;\nu} = 0$

$$\dot\rho = -3H(\rho + p). \tag{5.5}$$

These equations form the basis of the standard Big Bang model.

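If our Lagrangian contains scalar fields, then from the scalar field
contribution to the energy-momentum tensor

$$T_{\mu\nu} = \partial_\mu \phi\, \partial_\nu \phi - \frac{1}{2} g_{\mu\nu}\, \partial_\rho \phi\, \partial^\rho \phi - g_{\mu\nu} V(\phi) \tag{5.6}$$

we can identify the energy density and pressure due to a scalar $\phi$,

$$\rho = \frac{1}{2}\dot\phi^2 + \frac{1}{2} R^{-2}(t) (\nabla\phi)^2 + V(\phi) \tag{5.7}$$

$$p = \frac{1}{2}\dot\phi^2 - \frac{1}{6} R^{-2}(t) (\nabla\phi)^2 - V(\phi). \tag{5.8}$$

In addition, we have the equation of motion,

$$\ddot\phi + 3H\dot\phi + \frac{\partial V}{\partial \phi} = 0. \tag{5.9}$$

Finally, I remind the reader that in the early radiation dominated Universe,
the energy density (as well as the Hubble parameter) is determined by the
temperature,

$$\rho = N \frac{\pi^2}{30} T^4, \tag{5.10}$$

where $N = N(T)$ counts the relativistic degrees of freedom.
The critical energy density (corresponding to $k = 0$) is

$$\rho_c = 3H^2 \kappa^{-2} = 1.88 \times 10^{-29}\, h_0^2\ \mathrm{g\ cm^{-3}} \tag{5.11}$$

where the scaled Hubble parameter is $h_0 = H_0/(100\ \mathrm{km\ Mpc^{-1}\ s^{-1}})$. The
cosmological density parameter is defined as $\Omega = \rho/\rho_c$.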
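The numerical value of the critical density follows directly from $\rho_c = 3H^2/\kappa^2$ with $\kappa^2 = 8\pi G$. The short sketch below reproduces it in cgs units; the values of Newton's constant and of the megaparsec are standard constants that I have supplied, and the function name is illustrative.

```python
import math

G_NEWTON = 6.674e-8                 # cm^3 g^-1 s^-2
CM_PER_MPC = 3.0857e24
H100 = 100 * 1.0e5 / CM_PER_MPC     # 100 km/s/Mpc expressed in s^-1


def critical_density(h0: float = 1.0) -> float:
    """rho_c = 3 H^2 / (8 pi G) in g cm^-3 for H_0 = 100 h0 km/s/Mpc."""
    hubble = h0 * H100
    return 3 * hubble**2 / (8 * math.pi * G_NEWTON)


print(f"rho_c = {critical_density():.3e} h0^2 g cm^-3")
```

Since $\rho_c$ scales as $h_0^2$, a density parameter $\Omega$ is most conveniently quoted in the combination $\Omega h_0^2$.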



5.1 The Polonyi problem 

The Polonyi problem, although based on the specific choice for breaking
supergravity discussed in the previous lecture (Eq. (4.21)), is a generic problem
in broken supergravity models. In fact, the problem is compounded in string
theories, where there are in general many such fields called moduli. Here,
attention will be restricted to the simple example of the Polonyi potential.
The potential in equation (4.22) has the property that its minimum
occurs at $\langle z \rangle = (\sqrt{3} - 1) M$, where $M = \kappa^{-1}$ is the reduced Planck mass.
Recall that the constant $\beta$ was chosen so that at the minimum, $V(\langle z \rangle) = 0$.
In contrast to the value of the expectation value, the curvature of the
potential at the minimum is $m_z^2 \sim \mu^2$, which as argued earlier is related
to the gravitino mass and therefore must be of order the weak scale. In
addition the value of the potential at the origin is of order $V(0) \sim \mu^2 M^2$,
i.e., an intermediate scale. Thus, we have a long and very flat potential.

It is straightforward to show that such a
potential can be very problematic cosmologically [46]. The evolution of the
Polonyi field $z$ is governed by equation (5.9) with potential (4.22). There
is no reason to expect that the field $z$ is initially at its minimum. This
is particularly true if there was a prior inflationary epoch, since quantum
fluctuations for such a light field would be large, displacing the field by
an amount of order $M$ from its minimum. If $z \neq \langle z \rangle$, the evolution of $z$
must be traced. When the Hubble parameter $H > \mu$, $z$ is approximately
constant. That is, the potential term (proportional to $\mu^2$) can be neglected.
At later times, as $H$ drops, $z$ begins to oscillate about the minimum when
$H < \mu$. Generally, oscillations begin when $H \sim m_z \sim \mu$, as can be seen
from the solution for the evolution of a non-interacting massive field with
$V = m_z^2 z^2/2$. This solution would take the form of $z \sim \sin(m_z t)/t$ with
$H = 2/3t$.
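As a short worked check (added here as an illustration, with $V = m_z^2 z^2/2$ and $H = 2/(3t)$ as in the text), one can substitute $z = \sin(m_z t)/t$ into the equation of motion (5.9):

$$\dot z = \frac{m_z \cos(m_z t)}{t} - \frac{\sin(m_z t)}{t^2}, \qquad \ddot z = -\frac{m_z^2 \sin(m_z t)}{t} - \frac{2 m_z \cos(m_z t)}{t^2} + \frac{2 \sin(m_z t)}{t^3},$$

$$\ddot z + \frac{2}{t}\dot z + m_z^2 z = \left(-\frac{m_z^2 \sin(m_z t)}{t} + \frac{m_z^2 \sin(m_z t)}{t}\right) + \left(-\frac{2 m_z \cos(m_z t)}{t^2} + \frac{2 m_z \cos(m_z t)}{t^2}\right) + \left(\frac{2 \sin(m_z t)}{t^3} - \frac{2 \sin(m_z t)}{t^3}\right) = 0.$$

The amplitude therefore falls as $1/t$, so that the energy density $\sim m_z^2 z^2$ falls as $1/t^2 \propto R^{-3}$, as expected for non-relativistic matter.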

At the time that the z-oscillations begin, the Universe becomes
dominated by the potential $V(z)$, since $H^2 \sim \rho/M^2$. Therefore all other
contributions to $\rho$ will redshift away, leaving the potential as the
dominant component of the energy density. Since the oscillations evolve as
non-relativistic matter (recall that in the above solution for $z$, $H = 2/3t$ as in a
matter dominated Universe), we can express the energy density as the
Universe evolves as $\rho \sim \mu^2 M^2 (R_z/R)^3$, where $R_z$ is the value of the scale
factor when the oscillations begin. Oscillations continue until the z-fields can
decay. Since they are only gravitationally coupled to matter, their decay
rate is given by $\Gamma_z \sim \mu^3/M^2$. Therefore oscillations continue until $H \sim \Gamma_z$,
or when $R = R_{dz} \sim (M/\mu)^{4/3} R_z$. The energy density at this time is only
$\mu^6/M^2$. Even if the thermalization of the decay products occurs rapidly,
the Universe reheats only to a temperature of order $T_R \sim \rho^{1/4} \sim \mu^{3/2}/M^{1/2}$.
For $\mu \sim 100$ GeV, we have $T_R \sim 100$ keV! There are two grave problems
with this scenario. The first is that Big Bang nucleosynthesis would have
taken place during the oscillations, which are characteristic of a matter
dominated expansion rather than a radiation dominated one. Because of the
difference in the expansion rate the abundances of the light elements would
be greatly altered (see e.g. [47]). Even more problematic is the entropy
release due to the decay of these oscillations. The entropy increase [46] is
related to the ratio of the reheat temperature to the temperature of the
radiation in the Universe when the oscillations decay, $T_d \sim T_i (R_z/R_{dz})$,
where $T_i$ is the temperature when oscillations began, $T_i \sim (\mu M)^{1/2}$.
Therefore, the entropy increase is given by

$$S_f/S_i \sim (T_R/T_d)^3 \sim (M/\mu) \sim 10^{16}. \tag{5.12}$$

This is far too much to understand the present value of the
baryon-to-entropy ratio, of order $10^{-11} - 10^{-10}$ as required by nucleosynthesis and
allowed by baryosynthesis. That is, even if a baryon asymmetry of order
one could be produced, a dilution by a factor of $10^{16}$ could not be
accommodated.

5.2 The gravitino problem 

Another problem which results from the breaking of supergravity is the
gravitino problem [48]. If gravitinos are present with equilibrium number
densities, we can write their energy density as

$$\rho_{3/2} = m_{3/2}\, n_{3/2} = m_{3/2} \left(\frac{3\zeta(3)}{\pi^2}\right) T_{3/2}^3 \tag{5.13}$$

where today one expects that the gravitino temperature $T_{3/2}$ is reduced
relative to the photon temperature due to the annihilations of particles
dating back to the Planck time [49]. Typically one can expect $Y_{3/2} =
(T_{3/2}/T_\gamma)^3 \sim 10^{-2}$. Then for $\Omega_{3/2} h^2 \lesssim 1$, we obtain the limit that $m_{3/2} \lesssim
1$ keV.

Of course, the above mass limit assumes a stable gravitino. The problem
persists, however, even if the gravitino decays, since its gravitational decay
rate is very slow. Gravitinos decay when their decay rate, $\Gamma_{3/2} \sim m_{3/2}^3/M_P^2$,
becomes comparable to the expansion rate of the Universe (which becomes
dominated by the mass density of gravitinos), $H \sim m_{3/2}^{1/2} T_{3/2}^{3/2}/M_P$. Thus
decays occur at $T_d \sim m_{3/2}^{5/3}/M_P^{2/3}$. After the decay, the Universe is "reheated"
to a temperature

$$T_R \sim \rho(T_d)^{1/4} \sim m_{3/2}^{3/2}/M_P^{1/2}. \tag{5.14}$$

As in the case of the decay of the Polonyi fields, the Universe must reheat
sufficiently so that Big Bang nucleosynthesis occurs in a standard radiation
dominated Universe. For $T_R \gtrsim 1$ MeV, we must require $m_{3/2} \gtrsim 20$ TeV.
This large value threatens the solution of the hierarchy problem. In
addition, one must still be concerned about the dilution of the baryon-to-entropy
ratio [50], in this case by a factor $\Delta = (T_R/T_d)^3 \sim Y (M_P/m_{3/2})^{1/2}$.
Dilution may not be a problem if the baryon-to-entropy ratio is initially large.
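The bound of roughly 20 TeV can be recovered from the scaling argument alone. Equating the gravitino decay rate $m_{3/2}^3/M_P^2$ with the matter-dominated expansion rate gives a reheat temperature $T_R \sim m_{3/2}^{3/2}/M_P^{1/2}$, which the sketch below inverts. All order-one factors are dropped, the prefactor $Y$ in the dilution is set to one, and the function names are illustrative, so only the orders of magnitude are meaningful.

```python
M_PLANCK = 1.2e19  # GeV, as quoted in the text


def reheat_temperature(m32: float) -> float:
    """T_R ~ m_{3/2}^{3/2} / M_P^{1/2} in GeV, order-one factors dropped."""
    return m32**1.5 / M_PLANCK**0.5


def min_gravitino_mass(t_reheat: float) -> float:
    """Invert the scaling: smallest m_{3/2} (GeV) giving T_R above t_reheat."""
    return (t_reheat**2 * M_PLANCK) ** (1.0 / 3.0)


def dilution_factor(m32: float) -> float:
    """Delta ~ (M_P / m_{3/2})^{1/2}, with the prefactor Y set to one."""
    return (M_PLANCK / m32) ** 0.5


m_min = min_gravitino_mass(1e-3)   # require T_R above 1 MeV
print(f"m_3/2 > {m_min / 1e3:.0f} TeV for T_R > 1 MeV")
print(f"T_R(100 GeV gravitino) ~ {reheat_temperature(100.0) * 1e6:.2f} keV")
print(f"dilution at m_min ~ {dilution_factor(m_min):.1e}")
```

A weak-scale gravitino therefore reheats the Universe to well below the MeV temperatures needed for standard nucleosynthesis.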

Inflation (discussed below) could alleviate the gravitino problem by
diluting the gravitino abundance to safe levels [50]. If gravitinos satisfy the
noninflationary bounds, then their reproduction after inflation is never a
problem. For gravitinos with mass of order 100 GeV, dilution without
over-regeneration will also solve the problem, but there are several factors one
must contend with in order to be cosmologically safe. Gravitino decay
products can also upset the successful predictions of Big Bang nucleosynthesis,
and decays into LSPs (if R-parity is conserved) can also yield too large
a mass density in the now-decoupled LSPs [19]. For unstable gravitinos,
the most restrictive bound on their number density comes from the
photo-destruction of the light elements produced during nucleosynthesis [51]

$$n_{3/2}/n_\gamma \lesssim 10^{-13}\, (100\ \mathrm{GeV}/m_{3/2}) \tag{5.15}$$

for lifetimes $> 10^4$ s. Gravitinos are regenerated after inflation and one can
estimate [19,50,51]

$$n_{3/2}/n_\gamma \sim (\Gamma/H)(T_{3/2}/T_\gamma)^3 \sim \alpha N(T_R)(T_R/M_P)(T_{3/2}/T_\gamma)^3 \tag{5.16}$$

where $\Gamma \sim \alpha N(T_R)(T_R^3/M_P^2)$ is the production rate of gravitinos. Combining
these last two equations one can derive bounds on $T_R$

$$T_R \lesssim 4 \times 10^{9}\ \mathrm{GeV}\, (100\ \mathrm{GeV}/m_{3/2}) \tag{5.17}$$

using a more recent calculation of the gravitino regeneration rate [52]. A
slightly stronger bound (by an order of magnitude in $T_R$) was found in [53].

5.3 Inflation 

It would be impossible in the context of these lectures to give any kind of 
comprehensive review of inflation whether supersymmetric or not. I refer 
the reader to several reviews [54] . Here I will mention only the most salient 
features of inflation as it connects with supersymmetry. 

Supersymmetry was first introduced [55] in inflationary models as a
means to cure some of the problems associated with the fine-tuning of
new inflation [56]. New inflationary models based on a Coleman-Weinberg
type of SU(5) breaking produced density fluctuations [57] with magnitude
$\delta\rho/\rho \sim O(10^2)$ rather than $\delta\rho/\rho \sim 10^{-5}$ as needed to remain consistent with
microwave background anisotropies. Other more technical problems [58]
concerning slow rollover and the effects of quantum fluctuations also passed
doom on this original model.

The problems associated with new inflation had to do with the interactions
of the scalar field driving inflation, namely the SU(5) adjoint. One cure
is to (nearly) completely decouple the field driving inflation, the inflaton,
from the gauge sector. As gravity becomes the primary interaction to be
associated with the inflaton, it seemed only natural to take all scales to be
the Planck scale [55]. Supersymmetry was also employed to give flatter
potentials and acceptable density perturbations [59]. These models were
then placed in the context of N = 1 supergravity [60,61].

The simplest such model for the inflaton $\eta$ is based on a superpotential
of the form

$$W(\eta) = \mu^2 (1 - \eta/M_P)^2 M_P \tag{5.18}$$

or

$$W(\eta) = \mu^2 (\eta - \eta^4/4M_P^3) \tag{5.19}$$

where equation (5.18) [61] is to be used in minimal supergravity while
equation (5.19) [62] is to be used in no-scale supergravity. Of course the real
goal is to determine the identity of the inflaton. Presumably string theory
holds the answer to this question, but a fully string theoretic inflationary
model has yet to be realized [63].

For the remainder of the discussion, it will be sufficient to consider only
a generic model of inflation whose potential is of the form:

$$V(\eta) = \mu^4 P(\eta) \tag{5.20}$$

where $\eta$ is the scalar field driving inflation, the inflaton, $\mu$ is an as yet
unspecified mass parameter, and $P(\eta)$ is a function of $\eta$ which possesses
the features necessary for inflation, but contains no small parameters, i.e.,
where all of the couplings in $P$ are $O(1)$ but may contain non-renormalizable
terms.

The requirements for successful inflation can be expressed in terms of
two conditions:

1) enough inflation;

$$\frac{\partial^2 V}{\partial \eta^2} < \frac{3H^2}{65} = \frac{8\pi V(0)}{65 M_P^2} \tag{5.21}$$

2) density perturbations of the right magnitude [57];

$$\frac{\delta\rho}{\rho} \simeq \frac{H^2}{10\pi^{3/2}\dot\eta} \simeq O(100) \frac{\mu^2}{M_P^2} \tag{5.22}$$

given here for scales which "re-enter" the horizon during the matter
dominated era. For large scale fluctuations of the type measured by
COBE [64], we can use equation (5.22) to fix the inflationary scale
$\mu$ [65]:

$$\frac{\mu^2}{M_P^2} = \mathrm{few} \times 10^{-8}. \tag{5.23}$$

Fixing $(\mu^2/M_P^2)$ has immediate general consequences for inflation [66]. For
example, the Hubble parameter during inflation is $H^2 \simeq (8\pi/3)(\mu^4/M_P^2)$, so
that $H \sim 10^{-7} M_P$. The duration of inflation is $\tau \simeq M_P^3/\mu^4$, and the number
of e-foldings of expansion is $H\tau \sim 8\pi (M_P^2/\mu^2) \sim 10^9$. If the inflaton decay
rate goes as $\Gamma \sim m_\eta^3/M_P^2 \sim \mu^6/M_P^5$, the universe recovers at a temperature
$T_R \sim (\Gamma M_P)^{1/2} \sim \mu^3/M_P^2 \sim 10^{-11} M_P \sim 10^8$ GeV. However, it was noted
in [66] that in fact the Universe is not immediately thermalized subsequent
to inflaton decays, and the process of thermalization actually leads to a
smaller reheating temperature,

$$T_R \sim \alpha^2 \mu^3/M_P^2 \sim 10^5\ \mathrm{GeV}, \tag{5.24}$$

where $\alpha^2 \sim 10^{-3}$ characterizes the strength of the interactions leading to
thermalization. This low reheating temperature is certainly safe with
regards to the gravitino limit (5.17) discussed above.
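The chain of estimates above is simple enough to evaluate directly. The sketch below assumes $\mu^2/M_P^2 = 3 \times 10^{-8}$ as a representative value of "few $\times 10^{-8}$", takes the inflaton mass to be $m_\eta \sim \mu^2/M_P$ and its decay rate to be $\Gamma \sim m_\eta^3/M_P^2$, and uses $\alpha^2 \sim 10^{-3}$ for the thermalization factor; these inputs and all variable names are illustrative assumptions, and order-one factors are ignored throughout.

```python
import math

M_PLANCK = 1.2e19          # GeV
MU2_OVER_MP2 = 3e-8        # "few x 10^-8"; the value 3 is an assumption
ALPHA_SQ = 1e-3            # strength of the thermalizing interactions

mu = math.sqrt(MU2_OVER_MP2) * M_PLANCK                  # inflationary scale
hubble = math.sqrt(8 * math.pi / 3) * mu**2 / M_PLANCK   # H during inflation
e_folds = 8 * math.pi / MU2_OVER_MP2                     # H * tau
m_inflaton = mu**2 / M_PLANCK                            # m_eta ~ mu^2 / M_P
gamma = m_inflaton**3 / M_PLANCK**2                      # inflaton decay rate
t_instant = math.sqrt(gamma * M_PLANCK)                  # ~ mu^3 / M_P^2
t_reheat = ALPHA_SQ * t_instant                          # after thermalization

print(f"H / M_P        ~ {hubble / M_PLANCK:.1e}")
print(f"e-foldings     ~ {e_folds:.1e}")
print(f"inflaton mass  ~ {m_inflaton:.1e} GeV")
print(f"T_R (instant)  ~ {t_instant:.1e} GeV")
print(f"T_R (thermal.) ~ {t_reheat:.1e} GeV")
```

With these inputs the thermalized reheating temperature comes out several orders of magnitude below the gravitino bound on $T_R$ for a 100 GeV gravitino.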



5.4 Baryogenesis



The production of a net baryon asymmetry requires baryon number
violating interactions, C and CP violation and a departure from thermal
equilibrium [67]. The first two of these ingredients are contained in GUTs; the
third can be realized in an expanding universe where it is not uncommon
that interactions come in and out of equilibrium.

In the original and simplest model of baryogenesis [68], a GUT gauge or
Higgs boson decays out of equilibrium producing a net baryon asymmetry.
While the masses of the gauge bosons are fixed to the rather high GUT scale
$10^{15-16}$ GeV, the masses of the triplets could be lighter, $O(10^{10})$ GeV, and
still remain compatible with proton decay because of the Yukawa
suppression in the proton decay rate when mediated by a heavy Higgs. This reduced
mass allows the simple out-of-equilibrium decay scenario to proceed after
inflation so long as the Higgs is among the inflaton decay products [69].
From the arguments above, an inflaton mass of $10^{11}$ GeV is sufficient to
realize this mechanism. Higgs decays in this mechanism would be well out
of equilibrium, as at reheating $T \ll m_H$ and $n_H \sim n_\gamma$. In this case, the
baryon asymmetry is given simply by

$$\frac{n_B}{s} \sim \epsilon \frac{n_H}{T_R^3} \sim \epsilon \frac{n_\eta}{T_R^3} \sim \epsilon \frac{T_R}{m_\eta} \sim \epsilon \left(\frac{m_\eta}{M_P}\right)^{1/2} \sim \epsilon \frac{\mu}{M_P} \tag{5.25}$$

where $\epsilon$ is the CP violation in the decay, $T_R$ is the reheat temperature after
inflation, and I have substituted for $n_\eta = \rho_\eta/m_\eta \sim \Gamma^2 M_P^2/m_\eta$.

In a supersymmetric grand unified SU(5) theory, the superpotential $F_Y$
must be expressed in terms of SU(5) multiplets

$$F_Y = h_d H_2\, \bar{5}\, 10 + h_u H_1\, 10\, 10 \tag{5.26}$$

where 10, $\bar{5}$, $H_1$ and $H_2$ are chiral supermultiplets for the 10 and $\bar{5}$-plets of
SU(5) matter fields and the Higgs 5 and $\bar{5}$ multiplets respectively.
There are now new dimension 5 operators [25, 70] which violate baryon
number and lead to proton decay as shown in Figure 7. The first of these 
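diagrams leads to effective dimension 5 Lagrangian terms such as

$$\mathcal{L}^{(5)}_{\rm eff} = \frac{h_u h_d}{M_H} (\tilde q \tilde q q l) \qquad (5.27)$$

and the resulting dimension 6 operator for proton decay [71]

$$\mathcal{L}^{(6)}_{\rm eff} = \frac{h_u h_d}{M_H} \frac{g^2}{M_{\tilde G}} (q q q l) \qquad (5.28)$$

As a result of diagrams such as these, the proton decay rate scales as
$\Gamma \sim h^4 g^4 / M_H^2 M_{\tilde G}^2$, where $M_H$ is the triplet mass, and $M_{\tilde G}$ is a typical
gaugino mass of order < 1 TeV. This rate however is much too large unless
$M_H > 10^{16}$ GeV.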





Fig. 7. Dimension 5 and induced dimension 6 graphs violating baryon number. 

It is however possible to have a lighter ($O(10^{10} - 10^{11})$ GeV) Higgs
triplet needed for baryogenesis in the out-of-equilibrium decay scenario with
inflation. One needs two pairs of Higgs five-plets ($H_1$, $H_2$ and $H_1'$, $H_2'$), which
is anyway necessary to have sufficient C and CP violation in the decays.
By coupling one pair ($H_2$ and $H_1'$) only to the third generation of fermions
via [72]

$$a H_1\, 10\, 10 + b H_1'\, 10_3\, 10_3 + c H_2\, 10_3\, \bar{5}_3 + d H_2'\, 10\, \bar{5} \qquad (5.29)$$

proton decay cannot be induced by the dimension five operators.

5.4.1 The Affleck-Dine mechanism 

Another mechanism for generating the cosmological baryon asymmetry is 
the decay of scalar condensates as first proposed by Affleck and Dine [73]. 
This mechanism is truly a product of supersymmetry. It is straightforward 
though tedious to show that there are many directions in field space such 
that the scalar potential given in equation (2.32) vanishes identically when 
SUSY is unbroken. That is, with a particular assignment of scalar vacuum
expectation values, $V = 0$ in both the F- and D-terms. An example of
such a direction is

$$u_3^c = a, \quad s_2^c = a, \quad -u_1 = v, \quad \mu^- = v, \quad b_1^c = e^{i\phi}\sqrt{v^2 + a^2} \qquad (5.30)$$

where $a$, $v$ are arbitrary complex vacuum expectation values. SUSY breaking
lifts this degeneracy so that

$$V \simeq \tilde m^2 \phi^2 \qquad (5.31)$$

where $\tilde m$ is the SUSY breaking scale and $\phi$ is the direction in field space
corresponding to the flat direction. For large initial values of $\phi$, $\phi_0 \sim M_{\rm GUT}$,
a large baryon asymmetry can be generated [73,74]. This requires the
presence of baryon number violating operators such as $O = qqql$ such that
$\langle O \rangle \neq 0$. The decay of these condensates through such an operator can lead
to a net baryon asymmetry.

In a supersymmetric GUT, as we have seen above, there are precisely these
types of operators. In Figure 8, a 4-scalar diagram involving the fields of
the flat direction (5.30) is shown. Again, $\tilde G$ is a (light) gaugino, and $\tilde X$ is
a superheavy gaugino. The two supersymmetry breaking insertions are of
order $\tilde m$, so that the diagram produces an effective quartic coupling of order
$\tilde m^2/(\phi_0^2 + M_X^2)$.




Fig. 8. Baryon number violating diagram involving flat direction fields. 

The baryon asymmetry is computed by tracking the evolution of the
sfermion condensate, which is determined by

$$\ddot\phi + 3H\dot\phi = -\tilde m^2 \phi. \qquad (5.32)$$

To see how this works, it is instructive to consider a toy model with
potential [74]

$$V(\phi, \phi^*) = \tilde m^2 \phi\phi^* + \frac{1}{2} i\lambda \left[\phi^4 - \phi^{*4}\right]. \qquad (5.33)$$
The equation of motion becomes

$$\ddot\phi_1 + 3H\dot\phi_1 = -\tilde m^2\phi_1 + 3\lambda\phi_1^2\phi_2 - \lambda\phi_2^3 \qquad (5.34)$$

$$\ddot\phi_2 + 3H\dot\phi_2 = -\tilde m^2\phi_2 - 3\lambda\phi_2^2\phi_1 + \lambda\phi_1^3 \qquad (5.35)$$

with $\phi = (\phi_1 + i\phi_2)/\sqrt{2}$. Initially, when the expansion rate of the Universe,
$H$, is large, we can neglect $\ddot\phi$ and $\tilde m$. As one can see from (5.33) the
flat direction lies along $\phi \simeq \phi_1 \simeq \phi_0$ with $\phi_2 \simeq 0$. In this case, $\dot\phi_1 \simeq 0$
and $\dot\phi_2 \simeq \frac{\lambda}{3H}\phi_0^3$. Since the baryon density can be written as $n_B = j_0 =
\frac{1}{2}(\phi_1\dot\phi_2 - \phi_2\dot\phi_1) \simeq \frac{\lambda}{6H}\phi_0^4$, by generating some motion in the imaginary $\phi$
direction, we have generated a net baryon density.

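When $H$ has fallen to order $\tilde m$ (when $t^{-1} \sim \tilde m$), $\phi_1$ begins to oscillate
about the origin with $\phi_1 \simeq \phi_0 \sin(\tilde m t)/\tilde m t$. At this point the baryon number
generated is conserved and the baryon density, $n_B$, falls as $R^{-3}$. Thus,

$$n_B \sim \frac{\lambda}{\tilde m}\phi_0^2\phi^2 \propto R^{-3} \qquad (5.36)$$

and relative to the number density of $\phi$'s ($n_\phi = \rho_\phi/\tilde m = \tilde m\phi^2$)

$$\frac{n_B}{n_\phi} \simeq \frac{\lambda\phi_0^2}{\tilde m^2}. \qquad (5.37)$$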



If it is assumed that the energy density of the Universe is dominated by $\phi$,
then the oscillations will cease when

$$\Gamma_\phi \simeq \frac{\tilde m^3}{\phi^2} \simeq H \simeq \frac{\rho_\phi^{1/2}}{M_P} \simeq \frac{\tilde m\phi}{M_P} \qquad (5.38)$$

or when the amplitude of oscillations has dropped to $\phi_D \simeq (M_P\tilde m^2)^{1/3}$.
Note that the decay rate is suppressed as fields coupled directly to $\phi$ gain
masses $\propto \phi$. It is now straightforward to compute the baryon to entropy
ratio.



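$$\frac{n_B}{s} \simeq \frac{n_B}{\rho_\phi^{3/4}} \simeq \frac{\lambda\phi_0^2\phi_D^2}{\tilde m^{5/2}\phi_D^{3/2}} = \frac{\lambda\phi_0^2}{\tilde m^2}\left(\frac{M_P}{\tilde m}\right)^{1/6} \qquad (5.39)$$

and after inserting the quartic coupling

$$\frac{n_B}{s} \simeq \epsilon\,\frac{\phi_0^2}{M_X^2 + \phi_0^2}\left(\frac{M_P}{\tilde m}\right)^{1/6} \qquad (5.40)$$

which could be O(1).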
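
The chain of estimates above can be checked numerically. The sketch below computes the decay amplitude $\phi_D = (M_P \tilde m^2)^{1/3}$ from the condition $\tilde m^3/\phi^2 = \tilde m\phi/M_P$ and then forms $n_B/\rho_\phi^{3/4}$ at $\phi = \phi_D$, using $n_B \sim \lambda\phi_0^2\phi^2/\tilde m$ and $\rho_\phi = \tilde m^2\phi^2$. The form taken for the quartic coupling, $\epsilon\,\tilde m^2/(\phi_0^2 + M_X^2)$, the Planck mass value, the sample input masses, and all function names are assumptions for illustration only.

```python
M_P = 1.22e19  # Planck mass in GeV (assumed value)


def decay_amplitude(m_tilde, m_p=M_P):
    """phi_D from m~^3/phi^2 = m~*phi/M_P, i.e. phi_D = (M_P m~^2)^(1/3)."""
    return (m_p * m_tilde**2) ** (1.0 / 3.0)


def baryon_to_entropy(lam, phi0, m_tilde, m_p=M_P):
    """n_B/s ~ n_B / rho_phi^(3/4), evaluated at phi = phi_D."""
    phi_d = decay_amplitude(m_tilde, m_p)
    n_b = lam * phi0**2 * phi_d**2 / m_tilde   # baryon density at decay
    rho_phi = m_tilde**2 * phi_d**2            # condensate energy density
    return n_b / rho_phi**0.75


def quartic_coupling(epsilon, phi0, m_x, m_tilde):
    """Assumed form of the effective coupling: eps * m~^2 / (phi0^2 + M_X^2)."""
    return epsilon * m_tilde**2 / (phi0**2 + m_x**2)


if __name__ == "__main__":
    m_tilde, phi0, m_x = 1e3, 1e16, 1e16  # GeV, illustrative values only
    lam = quartic_coupling(1.0, phi0, m_x, m_tilde)
    print(f"phi_D ~ {decay_amplitude(m_tilde):.1e} GeV")
    print(f"n_B/s ~ {baryon_to_entropy(lam, phi0, m_tilde):.1e} * epsilon")
```

The result depends on $M_P/\tilde m$ only through a sixth root, so the size of the asymmetry is controlled mainly by $\epsilon$ and by the ratio $\phi_0/M_X$.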

In the context of inflation, a couple of significant changes to the scenario 
take place. First, it is more likely that the energy density is dominated by 
the inflaton rather than the sfermion condensate. The sequence of events 
leading to a baryon asymmetry is then as follows [66]: After inflation,
oscillations of the inflaton begin at $R = R_\eta$ when $H \sim m_\eta$ and oscillations of the
sfermions begin at $R = R_\phi$ when $H \sim \tilde m$. If the Universe is inflaton dominated,
$H \sim m_\eta(R_\eta/R)^{3/2}$ since $H \sim \rho_\eta^{1/2}$ and $\rho_\eta \sim \eta^2 \sim R^{-3}$. Thus one
can relate $R_\eta$ and $R_\phi$, $R_\phi \simeq (m_\eta/\tilde m)^{2/3}R_\eta$. As discussed earlier, inflatons
decay when $\Gamma_\eta = m_\eta^3/M_P^2 = H$ or when $R = R_{d\eta} \simeq (M_P/m_\eta)^{4/3}R_\eta$. The
Universe then becomes dominated by the relativistic decay products of the
inflaton, $\rho_{r\eta} = m_\eta^{2/3}M_P^{10/3}(R_\eta/R)^4$ and $H = m_\eta^{1/3}M_P^{2/3}(R_\eta/R)^2$. Sfermion
decays still occur when $\Gamma_\phi = H$ which now corresponds to a value of the
scale factor $R_{d\phi} = (m_\eta^{7/15}\phi_0^{2/5}M_P^{2/15}/\tilde m)R_\eta$. The final baryon asymmetry
in the Affleck-Dine scenario with inflation becomes [66]



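$$\frac{n_B}{s} \sim \frac{\epsilon\,\phi_0^4\, m_\eta^{3/2}}{M_X^2 M_P^{5/2}\tilde m} \sim \frac{\epsilon\, m_\eta^{7/2}}{M_X^2 M_P^{1/2}\tilde m} \sim (10^{-6} - 1)\,\epsilon \qquad (5.41)$$

for $\tilde m \sim (10^{-17} - 10^{-16})M_P$, $M_X \sim (10^{-4} - 10^{-3})M_P$ and
$m_\eta \sim (10^{-8} - 10^{-7})M_P$.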
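
The sequence of scale factors in the inflationary case can be verified with a short script. The sketch below encodes $R_\phi/R_\eta = (m_\eta/\tilde m)^{2/3}$, $R_{d\eta}/R_\eta = (M_P/m_\eta)^{4/3}$ and $R_{d\phi}/R_\eta = m_\eta^{7/15}\phi_0^{2/5}M_P^{2/15}/\tilde m$, and checks that $\Gamma_\phi \sim \tilde m^3/\phi^2$ equals $H$ at $R_{d\phi}$ when $\phi^2 = \phi_0^2 (R_\phi/R)^3$. The last function evaluates $\epsilon\,\phi_0^4 m_\eta^{3/2}/(M_X^2 M_P^{5/2}\tilde m)$, the form of the asymmetry that follows from these scalings. The Planck mass value, the sample masses and the function names are assumptions for illustration.

```python
M_P = 1.22e19  # Planck mass in GeV (assumed value)


def r_phi(m_eta, m_tilde):
    """R_phi/R_eta: sfermion oscillations begin when H = m~."""
    return (m_eta / m_tilde) ** (2.0 / 3.0)


def r_decay_eta(m_eta, m_p=M_P):
    """R_d_eta/R_eta: inflaton decay, Gamma_eta = m_eta^3/M_P^2 = H."""
    return (m_p / m_eta) ** (4.0 / 3.0)


def r_decay_phi(phi0, m_eta, m_tilde, m_p=M_P):
    """R_d_phi/R_eta: sfermion decay during the radiation era."""
    return m_eta ** (7.0 / 15.0) * phi0 ** 0.4 * m_p ** (2.0 / 15.0) / m_tilde


def hubble(r, m_eta, m_p=M_P):
    """Expansion rate at R/R_eta = r, before and after inflaton decay."""
    if r < r_decay_eta(m_eta, m_p):
        return m_eta * r ** -1.5
    return m_eta ** (1.0 / 3.0) * m_p ** (2.0 / 3.0) * r ** -2.0


def gamma_phi(r, phi0, m_eta, m_tilde):
    """Gamma_phi ~ m~^3/phi^2 with phi^2 = phi0^2 (R_phi/R)^3."""
    phi_sq = phi0**2 * (r_phi(m_eta, m_tilde) / r) ** 3
    return m_tilde**3 / phi_sq


def baryon_to_entropy(epsilon, phi0, m_x, m_eta, m_tilde, m_p=M_P):
    """eps * phi0^4 * m_eta^(3/2) / (M_X^2 * M_P^(5/2) * m~)."""
    return epsilon * phi0**4 * m_eta**1.5 / (m_x**2 * m_p**2.5 * m_tilde)


if __name__ == "__main__":
    phi0, m_x, m_eta, m_tilde = 1e15, 1e16, 1e12, 1e3  # GeV, illustrative
    r = r_decay_phi(phi0, m_eta, m_tilde)
    print(f"Gamma_phi/H at R_d_phi: {gamma_phi(r, phi0, m_eta, m_tilde) / hubble(r, m_eta):.6f}")
    print(f"n_B/s ~ {baryon_to_entropy(1.0, phi0, m_x, m_eta, m_tilde):.1e} * epsilon")
```

The check only holds when $R_{d\phi} > R_{d\eta}$, that is, when the sfermion condensate decays after the inflaton, which is the ordering assumed in the text.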

When combined with inflation, it is important to verify that the AD 
flat directions remain flat. In general, during inflation, supersymmetry is 
broken. The gravitino mass is related to the vacuum energy and $m_{3/2}^2 \sim
V/M_P^2 \sim H^2$, thus lifting the flat directions and potentially preventing the
realization of the AD scenario as argued in [75]. To see this, recall the
minimal supergravity model defined in equations (4.17-4.20). Recall also,
the last term in equation (4.26), which gives a mass to all scalars (in the
matter sector), including flat directions of order the gravitino mass which
during inflation is large. This argument can be generalized to supergravity
models with non-minimal Kähler potentials.

However, in no-scale supergravity models, or more generally in models 
which possess a Heisenberg symmetry [76], the Kähler potential can be
written as (cf. Eq. (4.27))

$$G = f(z + z^* - \phi_i^*\phi^i) + \ln|W(\phi)|^2. \qquad (5.42)$$

Now, one can write

$$V = e^{f(\eta)}\left[\left(\frac{f'^2}{f''} - 3\right)|W|^2 - \frac{1}{f'}|W_i|^2\right]. \qquad (5.43)$$

It is important to notice that the cross term $|\phi_i^* W|^2$ has disappeared in the
scalar potential. Because of the absence of the cross term, flat directions
remain flat even during inflation [77]. The no-scale model corresponds to
$f = -3\ln\eta$, $f'^2 = 3f''$ and the first term in (5.43) vanishes. The potential
then takes the form

$$V = \frac{1}{3}\, e^{\frac{2}{3}f}\, |W_i|^2 \qquad (5.44)$$
which is positive definite. The requirement that the vacuum energy vanishes
implies $\langle W_i\rangle = \langle g_a\rangle = 0$ at the minimum. As a consequence $\eta$ is
undetermined and so is the gravitino mass $m_{3/2}(\eta)$.

The above argument is only valid at the tree level. An explicit one-loop
calculation [78] shows that the effective potential along the flat direction
has the form



where $\Lambda$ is the cutoff of the effective supergravity theory, and has a minimum
around $\phi \simeq 0.5\Lambda$. Thus, $\phi_0 \sim M_P$ will be generated and in this case
the subsequent sfermion oscillations will dominate the energy density and a
baryon asymmetry will result which is independent of inflationary parameters
as originally discussed in [73,74] and will produce $n_B/s \sim O(1)$. Thus
we are left with the problem that the baryon asymmetry in no-scale type
models is too large [77,79,80].

In [80], several possible solutions were presented to dilute the baryon 
asymmetry. These included 1) entropy production from moduli decay, 
2) the presence of non-renormalizable interactions, and 3) electroweak ef- 
fects. Moduli decay in this context, turns out to be insufficient to bring 
an initial asymmetry of order $n_B/s \sim 1$ down to acceptable levels. However,
as a by-product one can show that there is no moduli problem either.
In contrast, adding non-renormalizable Planck scale operators of the form
$\phi^{2n-2}/M_P^{2n-6}$ leads to a smaller initial value for $\phi_0$ and hence a smaller
value for $n_B/s$. For dimension 6 operators ($n = 4$), a baryon asymmetry
of order $n_B/s \sim 10^{-10}$ is produced. Finally, another possible suppression
mechanism is to employ the smallness of the fermion masses. The baryon
asymmetry is known to be wiped out if the net $B - L$ asymmetry vanishes
because of the sphaleron transitions at high temperature. However, Kuzmin
et al. [81] pointed out that this erasure can be partially circumvented if the
individual $(B - 3L_i)$ asymmetries, where $i = 1,2,3$ refers to three generations,
do not vanish even when the total asymmetry vanishes. Even though
there is still a tendency that the baryon asymmetry is erased by the chem- 
ical equilibrium due to the sphaleron transitions, the finite mass of the tau 
lepton shifts the chemical equilibrium between B and L3 towards the B side 
and leaves a finite asymmetry in the end. Their estimate is 




where the temperature $T \sim T_C \sim 200$ GeV is when the sphaleron transition
freezes out (similar to the temperature of the electroweak phase transition)
and $m_\tau(T)$ is expected to be somewhat smaller than $m_\tau(0) = 1.777$ GeV.
Overall, the sphaleron transition suppresses the baryon asymmetry by a
factor of $\sim 10^{-6}$. This suppression factor is sufficient to keep the total
baryon asymmetry at a reasonable order of magnitude in many of the cases
discussed above.

6 Dark matter and accelerator constraints 

There is considerable evidence for dark matter in the Universe [82] . The best 
observational evidence is found on the scale of galactic halos and comes from 
the observed flat rotation curves of galaxies. There is also good evidence 
for dark matter in elliptical galaxies, as well as clusters of galaxies coming 
from X-ray observations of these objects. In theory, we expect dark matter 
because 1) inflation predicts $\Omega = 1$, and the upper limit on the baryon
(visible) density of the Universe from Big Bang nucleosynthesis is $\Omega_B < 0.1$
[83]; 2) Even in the absence of inflation (which does not distinguish between
matter and a cosmological constant), the large scale power spectrum is
consistent with a cosmological matter density of $\Omega \sim 0.3$, still far above
the limit from nucleosynthesis; and 3) our current understanding of galaxy 
formation is inconsistent with observations if the Universe is dominated by 
baryons. 

It is also evident that not only must there be dark matter, the bulk 
of the dark matter must be non-baryonic. In addition to the problems 
with baryonic dark matter associated with nucleosynthesis or the growth of 
density perturbations, it is very difficult to hide baryons. There are indeed 
very good constraints on the possible forms for baryonic dark matter in our 
galaxy. Strong cases can be made against hot gas, dust, Jupiter size objects, 
and stellar remnants such as white dwarfs and neutron stars [84]. 

In what follows, I will focus on the region of the parameter space in 
which the relic abundance of dark matter contributes a significant though 
not excessive amount to the overall energy density. Denoting by $\Omega_\chi$ the
fraction of the critical energy density provided by the dark matter, the
density of interest falls in the range

$$0.1 < \Omega_\chi h^2 < 0.3. \qquad (6.1)$$

The lower limit in equation (6.1) is motivated by astrophysical relevance.
For lower values of $\Omega_\chi h^2$, there is not enough dark matter to play a significant
role in structure formation, or constitute a large fraction of the critical
density. The upper bound in (6.1), on the other hand, is an absolute constraint,
derivable from the age of the Universe, which can be expressed as

$$H_0 t_0 = \int_0^1 dx \left(1 - \Omega - \Omega_\Lambda + \Omega_\Lambda x^2 + \Omega/x\right)^{-1/2}. \qquad (6.2)$$

In (6.2), $\Omega$ is the density of matter relative to critical density, while $\Omega_\Lambda$
is the equivalent contribution due to a cosmological constant. Given a lower
bound on the age of the Universe, one can establish an upper bound on
$\Omega h^2$ from equation (6.2). A safe lower bound to the age of the Universe is
$t_0 \gtrsim 12$ Gyr, which translates into the upper bound given in (6.1). Adding
a cosmological constant does not relax the upper bound on $\Omega h^2$, so long
as $\Omega + \Omega_\Lambda \le 1$. If indeed, the indications for a cosmological constant from
recent supernovae observations [85] turn out to be correct, the density of
dark matter will be constrained to the lower end of the range in (6.1).
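
To make the age argument concrete, the following sketch integrates equation (6.2) with a simple midpoint rule and converts a lower bound on $t_0$ into an upper bound on $\Omega h^2$. It assumes the conversion $1/H_0 = 9.778\,h^{-1}$ Gyr for $H_0 = 100h$ km/s/Mpc; that constant, the step count and the function names are choices made here, not taken from the text.

```python
import math

HUBBLE_TIME_GYR = 9.778  # 1/H_0 in Gyr for h = 1 (assumed conversion)


def h0_t0(omega_m, omega_lambda=0.0, steps=20000):
    """Midpoint-rule estimate of the integral in eq. (6.2)."""
    dx = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += dx / math.sqrt(
            1.0 - omega_m - omega_lambda + omega_lambda * x * x + omega_m / x
        )
    return total


def max_omega_h2(omega_m, t0_min_gyr=12.0, omega_lambda=0.0):
    """Largest Omega h^2 compatible with an age of at least t0_min_gyr."""
    h_max = h0_t0(omega_m, omega_lambda) * HUBBLE_TIME_GYR / t0_min_gyr
    return omega_m * h_max**2


if __name__ == "__main__":
    for omega in (0.3, 0.5, 1.0):
        print(f"Omega = {omega}: H0 t0 = {h0_t0(omega):.3f}, "
              f"Omega h^2 < {max_omega_h2(omega):.3f}")
```

For $\Omega = 1$ and no cosmological constant the integral is exactly 2/3, and an age of 12 Gyr then gives $\Omega h^2$ just under 0.3, the upper end of the range in (6.1).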

As these lectures are focused on supersymmetry, I will not dwell on the 
detailed evidence for dark matter, nor other potential dark matter can- 
didates. Instead, I will focus on the role of supersymmetry and possible 
supersymmetric candidates. As was discussed at the end of Section 3, one 
way to ensure the absence of unwanted, B and L-violating superpotential
terms, is to impose the conservation of R-parity. In doing so, we have
the prediction that the lightest supersymmetric particle (LSP) will be stable.
It is worth noting that R-parity conservation is consistent with certain
mechanisms for generating neutrino masses in supersymmetric models. For
example, by adding a new chiral multiplet $\nu^c$ along with superpotential
terms of the form $H_2 L \nu^c + M\nu^c\nu^c$, although lepton number is violated (by
two units), R-parity is conserved. In this way a standard see-saw mechanism
for neutrino masses can be recovered.

The stability of the LSP almost certainly renders it a neutral weakly 
interacting particle [19]. Strong and electromagnetically interacting LSPs 
would become bound with normal matter forming anomalously heavy isotopes.
Indeed, there are very strong upper limits on the abundances, relative
to hydrogen, of nuclear isotopes [86], $n/n_H \lesssim 10^{-15}$ to $10^{-29}$ for 1 GeV
$\lesssim m \lesssim$ 1 TeV. A strongly interacting stable relic is expected to have an
abundance $n/n_H \lesssim 10^{-10}$ with a higher abundance for charged particles.

There are relatively few supersymmetric candidates which are not col- 
ored and are electrically neutral. The sneutrino [87] is one possibility, but in 
the MSSM, it has been excluded as a dark matter candidate by direct [88] 
searches, indirect [89] and accelerator [90] searches. In fact, one can set 
an accelerator based limit on the sneutrino mass from neutrino counting, 
$m_{\tilde\nu} > 43$ GeV [91]. In this case, the direct relic searches in underground
low-background experiments require $m_{\tilde\nu} > 1$ TeV [92]. Another possibility
is the gravitino which is probably the most difficult to exclude. I will con- 
centrate on the remaining possibility in the MSSM, namely the neutralinos. 

The neutralino mass matrix was discussed in Section 3.3 along with some 
particular neutralino states. In general, neutralinos can be expressed as a 
linear combination 

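$$\chi = \alpha\tilde B + \beta\tilde W^3 + \gamma\tilde H_1 + \delta\tilde H_2 \qquad (6.3)$$

and the coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$ depend only on $M_2$, $\mu$, and $\tan\beta$ (assuming
gaugino mass unification at the GUT scale so that $M_1 = \frac{5}{3}\frac{\alpha_1}{\alpha_2}M_2$).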
There are some limiting cases in which the LSP is nearly a pure state [19].
When $\mu \to 0$, $\tilde S^0$ is the LSP with

$$m_{\tilde S} \to \frac{2v_1v_2}{v^2}\,\mu = \mu\sin 2\beta. \qquad (6.4)$$

When $M_2 \to 0$, the photino is the LSP with [23]

$$m_{\tilde\gamma} \to \frac{8}{3}\,\frac{g_1^2}{g_1^2 + g_2^2}\,M_2. \qquad (6.5)$$

When $M_2$ is large and $M_2 \ll \mu$ then the bino $\tilde B$ is the LSP [93] and

$$m_{\tilde B} \simeq M_1 \qquad (6.6)$$

and finally when $\mu$ is large and $\mu \ll M_2$ the Higgsino states $\tilde H_{(12)}$ with mass
$m_{\tilde H_{(12)}} = -\mu$ for $\mu < 0$, or $\tilde H_{[12]}$ with mass $m_{\tilde H_{[12]}} = \mu$ for $\mu > 0$ are the
LSPs depending on the sign of $\mu$ [93].
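
The limiting masses can be cross-checked numerically. The sketch below assumes that $\alpha_1/\alpha_2 = \tan^2\theta_W$ in the gaugino mass relation and that the pure photino mass is $M_1\cos^2\theta_W + M_2\sin^2\theta_W$; with these assumptions it reproduces the factor $\frac{8}{3}\sin^2\theta_W$ of equation (6.5), where $\sin^2\theta_W = g_1^2/(g_1^2 + g_2^2)$. The numerical value of $\sin^2\theta_W$ and the function names are illustrative choices.

```python
SIN2_THETA_W = 0.2312  # sin^2(theta_W), assumed value


def bino_mass_parameter(m2, sin2_w=SIN2_THETA_W):
    """M_1 = (5/3)(alpha_1/alpha_2) M_2, taking alpha_1/alpha_2 = tan^2(theta_W)."""
    return (5.0 / 3.0) * sin2_w / (1.0 - sin2_w) * m2


def photino_mass(m2, sin2_w=SIN2_THETA_W):
    """M_1 cos^2(theta_W) + M_2 sin^2(theta_W), the pure-photino limit."""
    m1 = bino_mass_parameter(m2, sin2_w)
    return m1 * (1.0 - sin2_w) + m2 * sin2_w


def light_higgsino_mass(mu, tan_beta):
    """mu * sin(2 beta), the small-mu limit of eq. (6.4)."""
    return mu * 2.0 * tan_beta / (1.0 + tan_beta**2)


if __name__ == "__main__":
    m2 = 100.0  # GeV, illustrative
    print(f"M_1        = {bino_mass_parameter(m2):.1f} GeV")
    print(f"m_photino  = {photino_mass(m2):.1f} GeV")
    print(f"m_S (mu = 10 GeV, tan beta = 2) = {light_higgsino_mass(10.0, 2.0):.1f} GeV")
```

With these inputs $M_1$ is roughly half of $M_2$, which is why the bino rather than the wino is the lighter gaugino state when $\mu$ is large.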

In Figure 9 [93], regions in the $M_2$, $\mu$ plane with $\tan\beta = 2$ are shown
in which the LSP is one of several nearly pure states, the photino, $\tilde\gamma$, the
bino, $\tilde B$, a symmetric combination of the Higgsinos, $\tilde H_{(12)}$, or the Higgsino,
$\tilde S$. The dashed lines show the LSP mass contours. The cross hatched
regions correspond to parameters giving a chargino ($\tilde W^\pm$, $\tilde H^\pm$) state with
mass $m_{\tilde\chi} < 45$ GeV and as such are excluded by LEP [94]. This constraint
has been extended by LEP 1.5 [95], and LEP2 [96] and is shown by the
light shaded region and corresponds to regions where the chargino mass is
< 95 GeV. The newer limit does not extend deep into the Higgsino region
because of the degeneracy between the chargino and neutralino. Notice that
the parameter space is dominated by the $\tilde B$ or $\tilde H_{12}$ pure states and that the
photino (often quoted as the LSP in the past [23,97]) only occupies a small
fraction of the parameter space, as does the Higgsino combination $\tilde S^0$. Both
of these light states are now experimentally excluded.

The relic abundance of LSPs is determined by solving the Boltzmann
equation for the LSP number density in an expanding Universe. The technique
[98] used is similar to that for computing the relic abundance of massive
neutrinos [99]. The relic density depends on additional parameters in
the MSSM beyond $M_2$, $\mu$, and $\tan\beta$. These include the sfermion masses,
$m_{\tilde f}$, the Higgs pseudo-scalar mass, $m_A$, and the tri-linear masses $A$ as well
as two phases $\theta_\mu$ and $\theta_A$. To determine the relic density it is necessary to
obtain the general annihilation cross-section for neutralinos. This has been
done in [100-103]. In much of the parameter space of interest, the LSP
is a bino and the annihilation proceeds mainly through sfermion exchange
as shown in Figure 10. For binos, as was the case for photinos [23,97],
it is possible to adjust the sfermion masses to obtain closure density in a
wide mass range. Adjusting the sfermion mixing parameters [104] or CP
violating phases [34, 35] allows even greater freedom.

Fig. 9. Mass contours and composition of nearly pure LSP states in the
MSSM [93].

Fig. 10. Typical annihilation diagram for neutralinos through sfermion exchange.

Because of the p-wave suppression associated with Majorana fermions,
the s-wave part of the annihilation cross-section is suppressed by the outgoing
fermion masses. This means that it is necessary to expand the cross-section
to include p-wave corrections which can be expressed as a term
proportional to the temperature if neutralinos are in equilibrium. Unless
the $\tilde B$ mass happens to lie near $m_Z/2$ or $m_h/2$, in which case there are
large contributions to the annihilation through direct s-channel resonance
exchange, the dominant contribution to the $\tilde B\tilde B$ annihilation cross section
comes from crossed t-channel sfermion exchange. In the absence of such
a resonance, the thermally-averaged cross section for $\tilde B\tilde B \to f\bar f$ takes the
generic form



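$$\langle\sigma v\rangle = \left(1 - \frac{m_f^2}{m_{\tilde B}^2}\right)^{1/2}\frac{g_1^4}{128\pi}\left[(Y_L^2 + Y_R^2)^2\left(\frac{m_f^2}{\Delta_f^2}\right) + (Y_L^4 + Y_R^4)\left(\frac{4m_{\tilde B}^2}{\Delta_f^2}\right)(1 + \ldots)\,x\right] \equiv a + bx \qquad (6.7)$$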



where $Y_{L(R)}$ is the hypercharge of $f_{L(R)}$, $\Delta_f \equiv m_{\tilde f}^2 + m_{\tilde B}^2 - m_f^2$, and we have
shown only the leading P-wave contribution proportional to $x \equiv T/m_{\tilde B}$.
Annihilations in the early Universe continue until the annihilation rate $\Gamma \simeq
\langle\sigma v\rangle n_\chi$ drops below the expansion rate, $H$. For particles which annihilate
through approximate weak scale interactions, this occurs when $T \sim m_\chi/20$.
Subsequently, the relic density of neutralinos is fixed relative to the number
of relativistic particles.
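
The freeze-out condition $\Gamma \simeq H$ can be solved numerically to see where the estimate $T \sim m_\chi/20$ comes from. The sketch below assumes the standard nonrelativistic equilibrium density $n_{\rm eq} = g\,(mT/2\pi)^{3/2}e^{-m/T}$ and the radiation-era expansion rate $H \simeq 1.66\, g_*^{1/2}\, T^2/M_P$, neither of which is written out in the text. The cross section, mass, degrees of freedom and function names are illustrative assumptions.

```python
import math

M_P = 1.22e19  # Planck mass in GeV (assumed value)


def log_rate_over_hubble(y, sigma_v, mass, g=2.0, g_star=90.0):
    """ln(Gamma/H) at y = m/T, with Gamma = <sigma v> * n_eq."""
    temp = mass / y
    log_n_eq = math.log(g) + 1.5 * math.log(mass * temp / (2.0 * math.pi)) - y
    log_hubble = (math.log(1.66) + 0.5 * math.log(g_star)
                  + 2.0 * math.log(temp) - math.log(M_P))
    return math.log(sigma_v) + log_n_eq - log_hubble


def freeze_out_y(sigma_v, mass, lo=1.0, hi=100.0, iterations=60):
    """Bisect for the value of y = m/T at which Gamma = H."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_rate_over_hubble(mid, sigma_v, mass) > 0.0:
            lo = mid   # still in equilibrium, freeze-out is at larger y
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    sigma_v = 1e-9  # GeV^-2, a weak-scale cross section (illustrative)
    mass = 100.0    # GeV (illustrative)
    y_f = freeze_out_y(sigma_v, mass)
    print(f"freeze-out at T ~ m/{y_f:.1f}")
```

Because the equilibrium density falls exponentially, the result depends only logarithmically on the cross section and the mass, which is why a single number like $m_\chi/20$ is a useful rule of thumb.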

As noted above, the number density of neutralinos is tracked by a
Boltzmann-like equation,

$$\frac{dn}{dt} = -3\frac{\dot R}{R}\,n - \langle\sigma v\rangle(n^2 - n_0^2) \qquad (6.8)$$

where $n_0$ is the equilibrium number density of neutralinos. By defining the
quantity $f = n/T^3$, we can rewrite this equation in terms of $x$, as

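$$\frac{df}{dx} = m_\chi\left(\frac{8\pi^3}{90}\,G_N N\right)^{-1/2}\langle\sigma v\rangle\,(f^2 - f_0^2) \qquad (6.9)$$

with $G_N$ Newton's constant and $N$ the number of relativistic degrees of freedom.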

The solution to this equation at late times (small $x$) yields a constant value
of $f$, so that $n \propto T^3$. The final relic density expressed as a fraction of the
critical energy density can be written as [19]



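$$\Omega_\chi h^2 \simeq 1.9\times 10^{-11}\left(\frac{T_\chi}{T_\gamma}\right)^3 N_f^{1/2}\left(\frac{{\rm GeV}^{-2}}{a x_f + \frac{1}{2} b x_f^2}\right) \qquad (6.10)$$

where $(T_\chi/T_\gamma)^3$ accounts for the subsequent reheating of the photon temperature
with respect to $\chi$, due to the annihilations of particles with mass
$m < x_f m_\chi$ [49]. The subscript $f$ refers to values at freeze-out, i.e., when
annihilations cease.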
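
As an illustration of how the relic density formula is used, the sketch below evaluates $\Omega_\chi h^2 \simeq 1.9\times 10^{-11}\,(T_\chi/T_\gamma)^3 N_f^{1/2}\,/(a x_f + \frac{1}{2} b x_f^2)$ with $a$ and $b$ expressed in GeV$^{-2}$. All input numbers below are placeholders chosen only to show the scaling, and the function name is hypothetical.

```python
import math


def relic_density(a, b, x_f, n_f, temp_ratio_cubed):
    """Omega h^2 from the freeze-out formula; a, b in GeV^-2, x_f = T_f/m_chi."""
    annihilation = a * x_f + 0.5 * b * x_f**2
    return 1.9e-11 * temp_ratio_cubed * math.sqrt(n_f) / annihilation


if __name__ == "__main__":
    # Placeholder inputs: s-wave and p-wave coefficients, freeze-out at m/20.
    a, b = 1e-10, 1e-8
    x_f = 1.0 / 20.0
    omega = relic_density(a, b, x_f, n_f=90.0, temp_ratio_cubed=0.05)
    print(f"Omega h^2 ~ {omega:.2f}")
    # A larger cross section leaves fewer relics behind.
    print(f"doubled cross section: {relic_density(2 * a, 2 * b, x_f, 90.0, 0.05):.2f}")
```

The inverse dependence on $a$ and $b$ is the key feature: heavier sfermions suppress the annihilation cross section and therefore raise the relic density.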







In Figure 11 [105], regions in the $M_2 - \mu$ plane (rotated with respect
to Fig. 9) with $\tan\beta = 2$, and with a relic abundance $0.1 < \Omega h^2 < 0.3$
are shaded. In Figure 11, the sfermion masses have been fixed such that
$m_0 = 100$ GeV (the dashed curves border the region when $m_0 = 1000$ GeV).
Clearly the MSSM offers sufficient room to solve the dark matter problem.
In the higgsino sector $\tilde H_{12}$, additional types of annihilation processes known
as co-annihilations [106-108] between $\tilde H_{(12)}$ and the next lightest neutralino
($\tilde H_{[12]}$) must be included. These tend to significantly lower the relic abundance
in much of this sector and as one can see there is little room left for
Higgsino dark matter [105].







Fig. 11. Regions in the $M_2 - \mu$ plane where $0.1 < \Omega h^2 < 0.3$ [105]. Also shown
are the Higgsino purity contours (labeled 0.1, 0.5, and 0.9). As one can see, the
shaded region is mostly gaugino (low Higgsino purity). Masses are in GeV.

As should be clear from Figures 9 and 11, binos are a good and likely
choice for dark matter in the MSSM. For fixed $m_{\tilde f}$, $\Omega h^2 > 0.1$, for all
$m_{\tilde B} = 20-250$ GeV largely independent of $\tan\beta$ and the sign of $\mu$. In
addition, the requirement that $m_{\tilde f} > m_{\tilde B}$ translates into an upper bound
of about 250 GeV on the bino mass [93, 109]. By further adjusting the
trilinear $A$ and accounting for sfermion mixing this upper bound can be
relaxed [104] and by allowing for non-zero phases in the MSSM, the upper
limit can be extended to about 600 GeV [34]. For fixed $\Omega h^2 = 1/4$, we
would require sfermion masses of order 120-250 GeV for binos with masses
in the range 20-250 GeV. The Higgsino relic density, on the other hand, is
largely independent of $m_{\tilde f}$. For large $\mu$, annihilations into W and Z pairs
dominate, while for lower $\mu$, it is the annihilations via Higgs scalars which
dominate. Aside from a narrow region with $m_{\tilde H_{12}} < m_W$ and very massive
Higgsinos with $m_{\tilde H_{12}} > 500$ GeV, the relic density of $\tilde H_{12}$ is very low. Above
about 1 TeV, these Higgsinos are also excluded.

As discussed in Section 4, one can make a further reduction in the number
of parameters by setting all of the soft scalar masses equal at the GUT
scale, thereby considering the CMSSM. For a given value of $\tan\beta$, the
parameter space is best described in terms of the common gaugino masses
$m_{1/2}$ and scalar masses $m_0$ at the GUT scale. In Figure 12 [35], this
parameter space is shown for $\tan\beta = 2$. The light shaded region corresponds
to the portion of parameter space where the relic density $\Omega_\chi h^2$ is between
0.1 and 0.3. The darker shaded region corresponds to the parameters where
the LSP is not a neutralino but rather a $\tilde\tau_R$. In the $m_0 - m_{1/2}$ plane, the
upper limit to $m_0$ is due to the upper limit $\Omega_\chi h^2 < 0.3$. For larger $m_0$, the
large sfermion masses suppress the annihilation cross-section which is then
too small to bring the relic density down to acceptable levels. In this region,
the LSP is mostly $\tilde B$, and the value of $\mu$ cannot be adjusted to make the
LSP a Higgsino which would allow an enhanced annihilation particularly at
large $m_0$. The cosmologically interesting region at the left of the figure is
rather jagged, due to the appearance of pole effects. There, the LSP can
annihilate through s-channel Z and h (the light Higgs) exchange, thereby
allowing a very large value of $m_0$. Because the $\Omega_\chi h^2 = 0.3$ contour runs
into the $\tilde\tau_R$-LSP region at a given value of $m_{1/2}$, it was thought [32] that
this point corresponded to an upper limit to the LSP mass (since $m_{\tilde B}$ is
approximately $0.4\, m_{1/2}$). As we will see, this limit has been extended due
to co-annihilations of the $\tilde B$ and $\tilde\tau_R$ [110].

The $m_{1/2} - m_0$ parameter space is further constrained by the recent
runs at LEP. The negative results of LEP 1 searches for $Z^0 \to \chi^+\chi^-$
and $Z^0 \to \chi\chi'$ (where $\chi'$ denotes a generic heavier neutralino) already
established important limits on supersymmetric model parameters, but left open
the possibility that the lightest neutralino might be massless [94].
Subsequently, the data from higher-energy LEP runs, based on chargino and
neutralino pair production, complemented LEP 1 data and excluded the
possibility of a massless neutralino, at least if the input gaugino masses $M_\alpha$
were assumed to be universal [91, 111].

In Figure 13 [112], the constraints imposed by the LEP chargino and 
neutralino and slepton searches [96, 111] (hatched region) at LEP 2 are 
shown. The hatched regions correspond to the limits in the MSSM. 
Fig. 12. Region in the $m_{1/2} - m_0$ plane where $0.1 < \Omega h^2 < 0.3$ [35]. Masses are
in GeV.



The thick curve to the right of this region corresponds to the slightly more
restrictive bound due to the assumption of universal Higgs masses at the
GUT scale (CMSSM). The distinction between these two cases is more apparent
at other values of $\tan\beta$ [91,112]. The bounds shown here correspond
to the run at LEP at 172 GeV center of mass energy. The D0 limit on
the gluino mass [113] is also shown (dotted line). Of particular importance
are the bounds on supersymmetric Higgs production, for which we consider
both the $e^+e^- \to hZ$ and $e^+e^- \to hA$ reactions. The regions bounded
by the lack of Higgs events are labeled nUHM corresponding to the MSSM
(non-Universal Higgs Mass) and UHM corresponding to the UHM. Once
again, the limits plotted in this figure (13) correspond to the 172 GeV run
and have been greatly improved since. In this case the improvement in the
bound when restricting the value of $\mu$ to take its CMSSM value is clear. At
lower $\tan\beta$, the constraint curve moves quickly to the left, i.e., to higher
values of $m_{1/2}$ [112].

The region of the $m_{1/2} - m_0$ plane in which $0.1 < \Omega_\chi h^2 < 0.3$ for some
experimentally allowed value of $\mu < 0$ is light-shaded in Figure 13, and
the region of the plane in which $0.1 < \Omega_\chi h^2 < 0.3$ for $\mu$ determined by
the CMSSM constraint on the scalar masses is shown dark-shaded. The
MSSM region extends to large values of $m_0$ since $\mu$ can be adjusted to take
low values so that the LSP is Higgsino like and the relic density becomes
insensitive to the sfermion masses.



Fig. 13. The domains of the $m_{1/2} - m_0$ plane (masses in GeV) that are excluded by
the LEP 2 chargino and selectron searches, both without (hatched) and with the
assumption of Higgs scalar-mass universality for $\mu < 0$ and $\tan\beta = 2$ [112]. Also
displayed are the domains that are excluded by Higgs searches (solid lines) with
and without the assumption of universal scalar masses for Higgs bosons (UHM).
The region marked theory is excluded cosmologically because $m_{\tilde\tau_R} < m_\chi$. Also
shown are the domains that have relic neutralino densities in the favored range
with (dark) and without (light shaded) the scalar-mass universality assumption.

As one can see from Figure 13, the combined bounds from the chargino
and slepton searches provide a lower bound to $m_{1/2}$ which can be translated
into a bound on the neutralino mass. In the region of interest, we have the
approximate relation that $m_\chi \approx 0.4\, m_{1/2}$.

Ultimately, the Higgs mass bound alone does not provide an independent
bound on $m_{1/2}$. The reason is that the Higgs constraint curves bend to
the left at large $m_0$, where large sfermion masses lead to greater positive
radiative corrections to the Higgs mass, and the Higgs curve strikes the
chargino bound at sufficiently large $m_0$. However, when combined with the
cosmological limit on the relic density, a stronger constraint can be found.
Recall that at large $m_0$, the cosmological bound is satisfied by lowering
$\mu$. At certain values of $\tan\beta$ and $m_{1/2}$, one cannot lower $\mu$ and remain
consistent with the relic density limit and the Higgs mass limit. This is seen
in Figure 13 by the short horizontal extension of the nUHM Higgs curve.
At lower values of $\tan\beta$ this extension is lengthened. In the UHM case,
once again the UHM Higgs curve bends to the right at large $m_0$. However,
cosmology excludes the large $m_0$ region in the CMSSM (CMSSM and UHM
are treated synonymously here). In this case the mass bound on $m_{1/2}$ or
$m_\chi$ is much stronger.

The results from the 172 GeV run at LEP provided a bound on the
chargino mass of $m_{\chi^\pm} > 86$ GeV and corresponding bound on the neutralino
mass of $m_\chi > 40$ GeV [112]. As noted above, the UHM Higgs mass bound
becomes very strong at low $\tan\beta$. In fact, for $\tan\beta < 1.7$, the UHM Higgs
curve moves so far to the left so as to exclude the entire dark shaded region
required by cosmology. Subsequent to the 183 GeV run at LEP [114], the
chargino mass bound was pushed to $m_{\chi^\pm} > 91$ GeV, and a Higgs mass
bound was established to be $m_h \gtrsim 86$ GeV for $\tan\beta < 3$, and $m_h \gtrsim 76$
GeV for $\tan\beta > 3$. These limits were further improved by the 189 GeV
run, so that at the kinematic limit the chargino mass bound is $m_{\chi^\pm} > 95$
GeV, implying that $m_{1/2} \gtrsim 110$ GeV and the neutralino mass limit becomes
$m_\chi > 50$ GeV. The Higgs mass bound is now (as of the 189 GeV run) $m_h \gtrsim 95$
GeV for low $\tan\beta$ [115].

In the MSSM, in the region of the $M_2 - \mu$ plane where $\tilde H_{(12)}$ is the
LSP (that is, at large $\mu$ and very large $M_2$), the next lightest neutralino
or NLSP, is the $\tilde H_{[12]}$ state and the two are nearly degenerate with masses
close to $\mu$. In this case, additional annihilation channels (or co-annihilation)
which involve both a $\tilde H_{(12)}$ and a $\tilde H_{[12]}$ in the initial state become important
[106,107]. The enhanced annihilation of Higgsinos lowers the relic
density substantially and virtually eliminates the Higgsino as a viable dark
matter candidate. In the CMSSM, co-annihilation is also important, now
at large values of $m_{1/2}$ [110]. Recall that previously we discussed a possible
upper limit to the mass of the neutralino in the CMSSM, where the
cosmologically allowed region of Figure 12 runs into the region where the
$\tilde\tau_R$ is the LSP. Along the boundary of this region, the neutralino $\tilde B$, and the
$\tilde\tau_R$ are degenerate. Therefore close to this boundary, new co-annihilation
channels involving both $\tilde B$ and $\tilde\tau_R$ (as well as the other right-handed sleptons
which are also close in mass) in the initial state become important. As
in the MSSM, the co-annihilations reduce the relic density and as can be
seen in Figures 14 the upper limit to the neutralino is greatly relaxed [110].
Fig. 14. The light-shaded area is the cosmologically preferred region with
$0.1 < \Omega_\chi h^2 < 0.3$. The light dashed lines show the location of the cosmologically
preferred region if one ignores coannihilations with the light sleptons. In the dark
shaded regions in the bottom right of each panel, the LSP is the $\tilde\tau_R$, leading to
an unacceptable abundance of charged dark matter. Also shown are the isomass
contours $m_{\chi^\pm} = 95$ GeV and $m_h = 95, 100, 105, 110$ GeV, as well as an indication
of the slepton bound from LEP [115]. These figures are adapted from those
in [110].



As one can plainly see, the cosmologically allowed region is bent away from
the $\tilde\tau_R$-LSP region extending to higher values of $m_{1/2}$. Also shown in this
set of figures are the iso-mass contours for charginos, sleptons and Higgses,
so that the LEP limits discussed above can be applied. In these figures, one
can see the sensitivity of the Higgs mass to $\tan\beta$.

Despite the importance of the coannihilation, there is still an upper limit
to the neutralino mass. If we look at an extended version of Figures 14, as
shown in Figures 15, we see that eventually the cosmologically allowed
region intersects the $\tilde\tau_R$-LSP region. The new upper limit occurs at
$m_{1/2} \approx 1450$ GeV, implying a neutralino mass below about 600 GeV.





Fig. 15. The same as Figures 14, for $\mu < 0$, but extended to larger $m_{1/2}$. These
figures are adapted from those in [110].

As noted above, the Higgs mass bounds can be used to exclude low values
of $\tan\beta$. However, when co-annihilations are included, the limit on $\tan\beta$
is weakened. In the ($m_{1/2}$, $m_0$) plane, the Higgs iso-mass contour appears
nearly vertical for low values of $m_0$. A given contour is highly dependent
on $\tan\beta$, and moves quickly to the right as $\tan\beta$ decreases. This behavior
is demonstrated in the series of Figures 16, which show the positions of the
Higgs mass contours for $\tan\beta = 2$, 2.5, and 3, for both positive and negative
$\mu$. If we concentrate on the 95 GeV contour, we see that at $\tan\beta = 3$, the
bulk of the cosmological region is allowed. At $\tan\beta = 2.5$, much of the
bulk is excluded, though the trunk region is allowed. At the lower value
of $\tan\beta = 2$, the 95 GeV contour is off the scale of this figure. A thorough
examination [110] yields a limit $\tan\beta > 2.2$ for $\mu < 0$. For positive $\mu$, the
results are qualitatively similar. The Higgs mass contours are farther to
the left (relative to the negative $\mu$ case), and the limit on $\tan\beta$ is weaker.
Nevertheless, the limit is $\tan\beta > 1.8$ for $\mu > 0$.

As the runs at LEP are winding down, the prospects for improving the
current limits or, more importantly, discovering supersymmetry are
diminishing. Further progress will occur in the future with Run II at the Tevatron
and ultimately at the LHC. Currently, while we have strong and interesting
limits on the MSSM and CMSSM parameter space from LEP, much of the
phenomenological and cosmological parameter space remains viable.

Fig. 16. As in Figures 14 for $\tan\beta = 2$, 2.5, and 3, for both $\mu > 0$ and
$\mu < 0$. Higgs mass contours of 86, 95, and 96 GeV are displayed to show the
dependence on $\tan\beta$. I thank Toby Falk for providing these figures.










K.A. Olive: Supersymmetry 






J. McDonald, H. Murayama, D.V. Nanopoulos, S.J. Rey, M. Schmitt, M. 
Srednicki, K. Tamvakis, R. Watkins for many enjoyable collaborations which 
have been summarized in these lectures. I would also like to thank J. Ellis, 
T. Falk, S. Lee, and M. Voloshin for assistance in the preparation of these 
lecture notes. This work was supported in part by DOE grant DE-FG02- 
94ER40823 at Minnesota. 

References 

[1] Y.A. Gol’fand and E.P. Likhtman, Pis’ma Zh. Eksp. Teor. Fiz. 13 (1971) 323; P.
Ramond, Phys. Rev. D 3 (1971) 2415; A. Neveu and J.H. Schwarz, Phys. Rev. D 4 
(1971) 1109; D.V. Volkov and V.P. Akulov, Phys. Lett. 46B (1973) 109. 

[2] J. Wess and B. Zumino, Nucl. Phys. B 70 (1974) 39. 

[3] L. Maiani, Proc. Summer School on Particle Physics, Gif-sur-Yvette, 1979 (IN2P3,
Paris, 1980); G. 't Hooft, Recent Developments in Field Theories, edited by G. 't Hooft
et al. (Plenum Press, New York, 1980) p. 3; E. Witten, Nucl. Phys. B 188 (1981) 
513; R.K. Kaul, Phys. Lett. 109B (1982) 19. 

[4] J. Ellis, S. Kelley and D.V. Nanopoulos, Phys. Lett. B 249 (1990) 441; J. Ellis, S. 
Kelley and D.V. Nanopoulos, Phys. Lett. B 260 (1991) 131; U. Amaldi, W. de Boer 
and H. Furstenau, Phys. Lett. B 260 (1991) 447; P. Langacker and M. Luo, Phys. 
Rev. D 44 (1991) 817. 

[5] J.D. Bjorken and S.D. Drell, Relativistic Quantum Mechanics (McGraw Hill, New 
York, 1964). 

[6] P. Fayet and S. Ferrara, Phys. Rep. 32 (1977) 251. 

[7] J. Wess and J. Bagger, Supersymmetry and Supergravity (Princeton University 
Press, Princeton NJ, 1992). 

[8] G.G. Ross, Grand Unified Theories (Addison-Wesley, Redwood City CA, 1985).

[9] S. Martin [hep-ph/9709356]. 

[10] J. Ellis [hep-ph/9812235]. 

[11] S. Coleman and J. Mandula, Phys. Rev. 159 (1967) 1251. 

[12] R. Haag, J. Lopuszanski and M. Sohnius, Nucl. Phys. B 88 (1975) 257. 

[13] L. Girardello and M.T. Grisaru, Nucl. Phys. B 194 (1982) 65. 

[14] P. Fayet, Phys. Lett. B 64 (1976) 159; Phys. Lett. B 69 (1977) 489; Phys. Lett. B 
84 (1979) 416. 

[15] H.E. Haber and G.L. Kane, Phys. Rep. 117 (1985) 75. 

[16] Y. Okada, M. Yamaguchi and T. Yanagida, Progr. Theor. Phys. 85 (1991) 1; J. Ellis, 
G. Ridolfi and F. Zwirner, Phys. Lett. B 257 (1991) 83; Phys. Lett. B 262 (1991) 
477; H.E. Haber and R. Hempfling, Phys. Rev. Lett. 66 (1991) 1815; R. Barbieri, M.
Frigeni and F. Caravaglios, Phys. Lett. B 258 (1991) 167; Y. Okada, M. Yamaguchi 
and T. Yanagida, Phys. Lett. B 262 (1991) 54; H.E. Haber [hep-ph/9601330]. 

[17] M. Carena, M. Quiros and C.E.M. Wagner, Nucl. Phys. B 461 (1996) 407; H.E. 
Haber, R. Hempfling and A.H. Hoang, Zeit. Phys. C 75 (1997) 539.

[18] J. Ellis and S. Rudaz, Phys. Lett. 128B (1983) 248. 

[19] J. Ellis, J.S. Hagelin, D.V. Nanopoulos, K.A. Olive and M. Srednicki, Nucl. Phys. 
B 238 (1984) 453. 

[20] P. Fayet and J. Iliopoulos, Phys. Lett. 51B (1974) 461. 

[21] L. O’Raifeartaigh, Nucl. Phys. B 96 (1975) 331. 

[22] G.R. Farrar and P. Fayet, Phys. Lett. B 76 (1978) 575. 







The Primordial Universe 



[23] H. Goldberg, Phys. Rev. Lett. 50 (1983) 1419. 

[24] J. Ellis and D.V. Nanopoulos, Phys. Lett. 110B (1982) 44.

[25] S. Dimopoulos and H. Georgi, Nucl. Phys. B 193 (1981) 150. 

[26] K. Inoue, A. Kakuto, H. Komatsu and S. Takeshita, Prog. Theor. Phys. 68 (1982) 
927; 71 (1984) 413. 

[27] S. Dimopoulos and H. Georgi, Nucl. Phys. B 193 (1981) 50; S. Dimopoulos, S. Raby 
and F. Wilczek, Phys. Rev. D 24 (1981) 1681; L. Ibanez and G.G. Ross, Phys. Lett. 
105B (1981) 439. 

[28] M. Drees and M.M. Nojiri, Nucl. Phys. B 369 (1992) 54; W. de Boer 
[hep-ph/9402266]. 

[29] S.P. Martin and M.T. Vaughn, Phys. Rev. D 50 (1994) 2282; Y. Yamada, Phys. 
Rev. D 50 (1994) 3537; I. Jack and D.R.T. Jones, Phys. Lett. B 333 (1994) 372; 
I. Jack, D.R.T. Jones, S.P. Martin, M.T. Vaughn and Y. Yamada, Phys. Rev. D 50 
(1994) 5481; P.M. Ferreira, I. Jack and D.R.T. Jones, Phys. Lett. B 387 (1996) 80;
I. Jack, D.R.T. Jones and A. Pickering, Phys. Lett. B 432 (1998) 114.

[30] A. Salam and J. Strathdee, Phys. Rev. D 11 (1975) 1521; M.T. Grisaru, W. Siegel 
and M. Rocek, Nucl. Phys. B 159 (1979) 429. 

[31] L.E. Ibanez and G.G. Ross, Phys. Lett. B 110 (1982) 215; L.E. Ibanez, Phys. Lett. 
B 118 (1982) 73; J. Ellis, D.V. Nanopoulos and K. Tamvakis, Phys. Lett. B 121 
(1983) 123; J. Ellis, J. Hagelin, D.V. Nanopoulos and K. Tamvakis, Phys. Lett. B 
125 (1983) 275; L. Alvarez-Gaume, J. Polchinski and M. Wise, Nucl. Phys. B 221 
(1983) 495. 

[32] G. Kane, C. Kolda, L. Roszkowski and J. Wells, Phys. Rev. D 49 (1994) 6173. 

[33] M. Dugan, B. Grinstein and L. Hall, Nucl. Phys. B 255 (1985) 413; R. Arnowitt,
J.L. Lopez and D.V. Nanopoulos, Phys. Rev. D 42 (1990) 2423; R. Arnowitt, M.J.
Duff and K.S. Stelle, Phys. Rev. D 43 (1991) 3085; Y. Kizukuri and N. Oshimo, 
Phys. Rev. D 45 (1992) 1806; D 46 (1992) 3025; T. Ibrahim and P. Nath, Phys. 
Lett. B418 (1998) 98; Phys. Rev. D 57 (1998) 478; D 58 (1998) 111301; M. Brhlik, 
G.J. Good and G.L. Kane, Phys. Rev. D 59 (1999) 115004; A. Bartl, T. Gajdosik, 
W. Porod, P. Stockinger and H. Stremnitzer, Phys. Rev. D 60 (1999) 073003; T. 
Falk, K.A. Olive, M. Pospelov and R. Roiban, Nucl. Phys. B 560 (1999) 3. 

[34] T. Falk, K.A. Olive and M. Srednicki, Phys. Lett. B 354 (1995) 99. 

[35] T. Falk and K.A. Olive, Phys. Lett. B 375 (1996) 196; B 439 (1998) 71. 

[36] D.Z. Freedman, P. Van Nieuwenhuizen and S. Ferrara, Phys. Rev. D 13 (1976) 3214; 
S. Deser and B. Zumino, Phys. Lett. 62B (1976) 335; D.Z. Freedman and P. van 
Nieuwenhuizen, Phys. Rev. D 14 (1976) 912; see also P. Van Nieuwenhuizen, Phys. 
Rep. 68C (1981) 189. 

[37] E. Cremmer, B. Julia, J. Scherk, S. Ferrara, L. Girardello and P. Van Nieuwenhuizen,
Phys. Lett. 79B (1978) 231; Nucl. Phys. B 147 (1979) 105; E. Cremmer, S. Ferrara,
L. Girardello and A. Van Proeyen, Phys. Lett. 116B (1982) 231; and Nucl. Phys. B 
212 (1983) 413; R. Arnowitt, A.H. Chamseddine and P. Nath, Phys. Rev. Lett. 49 
(1982) 970; 50 (1983) 232; Phys. Lett. 121B (1983) 33; J. Bagger and E. Witten, 
Phys. Lett. 115B (1982) 202; 118B (1982) 103; J. Bagger, Nucl. Phys. B211 (1983) 
302. 

[38] J. Polonyi, Budapest preprint KFKI-1977-93 (1977). 

[39] D.V. Volkov and V.A. Soroka, JETP Lett. 18 (1973) 312; S. Deser and B. Zumino, 
Phys. Rev. Lett. 38 (1977) 1433. 

[40] R. Barbieri, S. Ferrara and C.A. Savoy, Phys. Lett. 119B (1982) 343.

[41] H.-P. Nilles, M. Srednicki and D. Wyler, Phys. Lett. 120B (1983) 345; L.J. Hall, J. 
Lykken and S. Weinberg, Phys. Rev. D 27 (1983) 2359. 







[42] E. Cremmer, S. Ferrara, C. Kounnas and D.V. Nanopoulos, Phys. Lett. 133B (1983) 
61; J. Ellis, C. Kounnas and D.V. Nanopoulos, Nucl. Phys. B 241 (1984) 429; J. 
Ellis, A.B. Lahanas, D.V. Nanopoulos and K. Tamvakis, Phys. Lett. 134B (1984) 
429. 

[43] E. Witten, Phys. Lett. 155B (1985) 151; S. Ferrara, C. Kounnas and M. Porrati, 
Phys. Lett. 181B (1986) 263; L.J. Dixon, V.S. Kaplunovsky and J. Louis, Nucl. 
Phys. B 329 (1990) 27; S. Ferrara, D. Lüst and S. Theisen, Phys. Lett. 233B (1989)
147. 

[44] J. Ellis, C. Kounnas and D.V. Nanopoulos, Nucl. Phys. B 247 (1984) 373; for a 
review see: A.B. Lahanas and D.V. Nanopoulos, Phys. Rep. 145 (1987) 1. 

[45] J.D. Breit, B. Ovrut and G. Segre, Phys. Lett. 162B (1985) 303; P. Binetruy and 
M.K. Gaillard, Phys. Lett. 168B (1986) 347; P. Binetruy, S. Dawson M.K. Gaillard 
and I. Hinchliffe, Phys. Rev. D 37 (1988) 2633 and references therein. 

[46] G.D. Coughlan, W. Fischler, E.W. Kolb, S. Raby and G.G. Ross, Phys. Lett. 131B
(1983) 59. 

[47] E.W. Kolb and R. Scherrer, Phys. Rev. D 25 (1982) 1481. 

[48] S. Weinberg, Phys. Rev. Lett. 48 (1982) 1303. 

[49] G. Steigman, K.A. Olive and D.N. Schramm, Phys. Rev. Lett. 43 (1979) 239; K.A. 
Olive, D.N. Schramm and G. Steigman, Nucl. Phys. B 180 (1981) 497. 

[50] J. Ellis, A.D. Linde and D.V. Nanopoulos, Phys. Lett. 118B (1982) 59. 

[51] J. Ellis, J.E. Kim and D.V. Nanopoulos, Phys. Lett. 145B (1984) 181; J. Ellis, D.V. 
Nanopoulos and S. Sarkar, Nucl. Phys. B 259 (1985) 175; R. Juszkiewicz, J. Silk 
and A. Stebbins, Phys. Lett. 158B (1985) 463; D. Lindley, Phys. Lett. B 171 (1986) 
235; M. Kawasaki and K. Sato, Phys. Lett. B 189 (1987) 23. 

[52] J. Ellis, D.V. Nanopoulos, K.A. Olive and S.J. Rey, Astropart. Phys. 4 (1996) 371. 

[53] M. Kawasaki and T. Moroi, Prog. Theor. Phys. 93 (1995) 879. 

[54] A.D. Linde, Particle Physics and Inflationary Cosmology (Harwood, 1990); K.A. 
Olive, Phys. Rep. 190 (1990) 181; D. Lyth and A. Riotto, Phys. Rep. 314 (1999) 1. 

[55] J. Ellis, D.V. Nanopoulos, K.A. Olive and K. Tamvakis, Nucl. Phys. B 221 (1983) 
224; Phys. Lett. 118B (1982) 335. 

[56] A.D. Linde, Phys. Lett. 108B (1982) 389; A. Albrecht and P.J. Steinhardt, Phys. 
Rev. Lett. 48 (1982) 1220. 

[57] W.H. Press, Phys. Scr. 21 (1980) 702; V.F. Mukhanov and G.V. Chibisov, JETP 
Lett. 33 (1981) 532; S.W. Hawking, Phys. Lett. 115B (1982) 295; A. A. Starobinsky, 
Phys. Lett. 117B (1982) 175; A.H. Guth and S.Y. Pi, Phys. Rev. Lett. 49 (1982) 
1110; J.M. Bardeen, P.J. Steinhardt and M.S. Turner, Phys. Rev. D 28 (1983) 679. 

[58] A.D. Linde, Phys. Lett. 116B (1982) 335. 

[59] J. Ellis, D.V. Nanopoulos, K.A. Olive and K. Tamvakis, Phys. Lett. 120B (1983) 
334. 

[60] D.V. Nanopoulos, K.A. Olive, M. Srednicki and K. Tamvakis, Phys. Lett. 123B 
(1983) 41. 

[61] R. Holman, P. Ramond and G.G. Ross, Phys. Lett. 137B (1984) 343. 

[62] J. Ellis, K. Enqvist, D.V. Nanopoulos, K.A. Olive and M. Srednicki, Phys. Lett.
152B (1985) 175. 

[63] P. Binetruy and M.K. Gaillard, Phys. Rev. D 34 (1986) 3069. 

[64] G.F. Smoot et al., ApJ 396 (1992) L1; E.L. Wright et al., ApJ 396 (1992) L13.

[65] B. Campbell, S. Davidson and K.A. Olive, Nucl. Phys. B 399 (1993) 111. 

[66] J. Ellis, K. Enqvist, D.V. Nanopoulos and K.A. Olive, Phys. Lett. B 191 (1987) 
343. 

[67] A.D. Sakharov, JETP Lett. 5 (1967) 24. 







[68] S. Weinberg, Phys. Rev. Lett. 42 (1979) 850; D. Toussaint, S.B. Treiman, F. Wilczek 
and A. Zee, Phys. Rev. D 19 (1979) 1036. 

[69] A.D. Dolgov and A.D. Linde, Phys. Lett. B 116 (1982) 329; D.V. Nanopoulos, K.A. 
Olive and M. Srednicki, Phys. Lett. B 127 (1983) 30. 

[70] N. Sakai and T. Yanagida, Nucl. Phys. B 197 (1982) 533; S. Weinberg, Phys. Rev. 
D 26 (1982) 287. 

[71] J. Ellis, D.V. Nanopoulos and S. Rudaz, Nucl. Phys. B 202 (1982) 43; S. 
Dimopoulos, S. Raby and F. Wilczek, Phys. Lett. 112B (1982) 133. 

[72] D.V. Nanopoulos and K. Tamvakis, Phys. Lett. B 114 (1982) 235. 

[73] I. Affleck and M. Dine, Nucl. Phys. B 249 (1985) 361. 

[74] A.D. Linde, Phys. Lett. B 160 (1985) 243. 

[75] M. Dine, L. Randall and S. Thomas, Phys. Rev. Lett. 75 (1995) 398; Nucl. Phys. B 
458 (1996) 291. 

[76] P. Binetruy and M.K. Gaillard, Phys. Lett. B 195 (1987) 382. 

[77] M.K. Gaillard, H. Murayama and K.A. Olive, Phys. Lett. B 355 (1995) 71. 

[78] M.K. Gaillard and V. Jain, Phys. Rev. D 49 (1994) 1951; M.K. Gaillard, V. Jain 
and K. Saririan, Phys. Lett. B 387 (1996) 520; Phys. Rev. D 55 (1997) 833. 

[79] J. Ellis, D.V. Nanopoulos and K.A. Olive, Phys. Lett. B 184 (1987) 37. 

[80] B.A. Campbell, M.K. Gaillard, H. Murayama and K.A. Olive, Nucl. Phys. B 538 
(1999) 351. 

[81] V.A. Kuzmin, V.A. Rubakov and M.E. Shaposhnikov, Phys. Lett. 191B (1987) 171; 
see also, H. Dreiner and G. Ross, Nucl. Phys. B 410 (1993) 188; S. Davidson, K. 
Kainulainen and K.A. Olive, Phys. Lett. B 335 (1994) 339. 

[82] see: J.R. Primack, in Enrico Fermi. Course 92, edited by N. Cabibbo (North 
Holland, Amsterdam, 1987) p. 137; V. Trimble, ARA&A 25 (1987) 425; J.
Primack, D. Seckel and B. Sadoulet, Ann. Rev. Nucl. Part. Sci. 38 (1988) 751;
Dark Matter, edited by M. Srednicki (North-Holland, Amsterdam, 1989). 

[83] K.A. Olive, G. Steigman and T.P. Walker, Phys. Rep. (in press) [astro-ph/9905320]. 

[84] D.J. Hegyi and K.A. Olive, Phys. Lett. 126B (1983) 28; ApJ 303 (1986) 56. 

[85] S. Perlmutter et al., Nat 391 (1998) 51; A.G. Riess et al., AJ 116 (1998) 1009. 

[86] J. Rich, M. Spiro and J. Lloyd-Owen, Phys. Rep. 151 (1987) 239; P.F. Smith, 
Contemp. Phys. 29 (1998) 159; T.K. Hemmick et al., Phys. Rev. D 41 (1990) 2074.

[87] L.E. Ibanez, Phys. Lett. 137B (1984) 160; J. Hagelin, G.L. Kane and S. Raby, Nucl.
Phys. B 241 (1984) 638; T. Falk, K.A. Olive and M. Srednicki, Phys. Lett. B 339
(1994) 248.

[88] S. Ahlen et al., Phys. Lett. B 195 (1987) 603; D.D. Caldwell et al., Phys. Rev. Lett.
61 (1988) 510; M. Beck et al., Phys. Lett. B 336 (1994) 141.

[89] see e.g. K.A. Olive and M. Srednicki, Phys. Lett. 205B (1988) 553. 

[90] The LEP Collaborations ALEPH, DELPHI, L3, OPAL and the LEP Electroweak 
Working Group, CERN preprint PPE/95-172 (1995). 

[91] J. Ellis, T. Falk, K. Olive and M. Schmitt, Phys. Lett. B 388 (1996) 97.

[92] H.V. Klapdor-Kleingrothaus and Y. Ramachers, Eur. Phys. J. A 3 (1998) 85.

[93] K.A. Olive and M. Srednicki, Phys. Lett. B 230 (1989) 78; Nucl. Phys. B 355 (1991) 
208. 

[94] ALEPH collaboration, D. Decamp et al., Phys. Rep. 216 (1992) 253; L3
collaboration, M. Acciarri et al., Phys. Lett. B 350 (1995) 109; OPAL collaboration, G.
Alexander et al., Phys. Lett. B 377 (1996) 273.

[95] ALEPH collaboration, D. Buskulic et al., Phys. Lett. B 373
(1996) 246; OPAL collaboration, G. Alexander et al., Phys. Lett. B 377 (1996)
181; L3 collaboration, M. Acciarri et al., Phys. Lett. B 377 (1996) 289; DELPHI
collaboration, P. Abreu et al., Phys. Lett. B 382 (1996) 323.







[96] ALEPH collaboration, R. Barate et al., Phys. Lett. B 412 (1997) 173; Eur. Phys.
J. C 2 (1998) 417; DELPHI collaboration, P. Abreu, Eur. Phys. J. C 1 (1998) 1;
Eur. Phys. J. C 2 (1998) 1; L3 collaboration, M. Acciarri et al., Phys. Lett. B 411
(1997) 373; Eur. Phys. J. C 4 (1998) 207; OPAL collaboration, K. Ackerstaff et al.,
Eur. Phys. J. C 1 (1998) 425; Eur. Phys. J. C 2 (1998) 213.

[97] L.M. Krauss, Nucl. Phys. B 227 (1983) 556. 

[98] R. Watkins, M. Srednicki and K.A. Olive, Nucl. Phys. B 310 (1988) 693. 

[99] P. Hut, Phys. Lett. 69B (1977) 85; B.W. Lee and S. Weinberg, Phys. Rev. Lett. 39 
(1977) 165. 

[100] J. McDonald, K.A. Olive and M. Srednicki, Phys. Lett. B 283 (1992) 80. 

[101] M. Drees and M.M. Nojiri, Phys. Rev. D 47 (1993) 376. 

[102] G. Jungman, M. Kamionkowski and K. Griest, Phys. Rep. 267 (1996) 195.

[103] H. Baer and M. Brhlik, Phys. Rev. D 53 (1996) 597.

[104] T. Falk, R. Madden, K.A. Olive and M. Srednicki, Phys. Lett. B 318 (1993) 354. 

[105] J. Ellis, T. Falk, G. Ganis, K.A. Olive and M. Schmitt, Phys. Rev. D 58 (1998)
095002.

[106] K. Griest and D. Seckel, Phys. Rev. D 43 (1991) 3191. 

[107] S. Mizuta and M. Yamaguchi, Phys. Lett. B 298 (1993) 120. 

[108] M. Drees, M.M. Nojiri, D.P. Roy and Y. Yamada, Phys. Rev. D 56 (1997) 276. 

[109] K. Griest, M. Kamionkowski and M.S. Turner, Phys. Rev. D 41 (1990) 3565.

[110] J. Ellis, T. Falk and K. Olive, Phys. Lett. B 444 (1998) 367; J. Ellis, T. Falk, K. 
Olive and M. Srednicki, Astr. Part. Phys. 13 (2000) 181. 

[111] ALEPH Collaboration, D. Buskulic et al., Z. Phys. C 72 (1996) 549. 

[112] J. Ellis, T. Falk, K.A. Olive and M. Schmitt, Phys. Lett. B 413 (1997) 355. 

[113] D0 collaboration, S. Abachi et al., Phys. Rev. Lett. 75 (1995) 618; CDF
collaboration, F. Abe et al., Phys. Rev. Lett. 76 (1996) 2006; Phys. Rev. D 56 (1997)
1357.

[114] DELPHI collaboration, P. Abreu, Phys. Lett. B 446 (1999) 75; OPAL collabora- 
tion, G. Abbiendi et al., Eur. Phys. J. C S (1999) 255; see also: The LEP Working 
Group for Higgs Boson Searches, ALEPH, DELPHI, L3 and OPAL, CERN-EP/99- 
060. 

[115] Recent official compilations of LEP limits on supersymmetric particles are available 
from: http://www.cern.ch/LEPSUSY/. 




COURSE 6 



DARK MATTER: DIRECT DETECTION 



G. CHARDIN 

DSM/DAPNIA/SPP, Centre d'Etudes
de Saclay, 91191 Gif-sur-Yvette
Cedex, France






Contents 



1 Motivations for non-baryonic Dark Matter

2 Weakly Interacting Massive Particles (WIMPs)

2.1 Phenomenology

2.2 Experimental signatures

2.3 WIMP direct detection experiments without discrimination

2.4 WIMP direct detection experiments with discrimination

2.5 Other discrimination techniques

2.6 Detecting the recoil direction

2.7 Low-energy WIMPs trapped in the Solar System

2.8 WIMPs with dominant axial interactions. Direct vs. indirect detection

2.9 Testing (a significant part of) the SUSY models

2.10 Conclusions

3 Axions

4 Conclusions and perspectives







1 Motivations for non-baryonic Dark Matter 

Determining the precise nature of Dark Matter is one of the main open
questions of contemporary physics. Its resolution will probably entail a major
Copernican revolution since baryonic matter, which constitutes our
environment, probably represents only a small fraction of the total energy
content of the Universe. Since the thirties, when the Dark Matter question
was initially raised by Zwicky [1], and despite impressive experimental
progress and effort over the last ten years [2], the precise nature of this Dark
Matter has not yet been uncovered.

Over the last few years, several candidates have been proposed to solve
the Dark Matter enigma. These include anomalous gravity [3], Massive
Compact Halo Objects (MACHOs) in our galaxy [4, 5], and massive
neutrinos [6]. The recent observation of an accelerating universe [7, 8] came as
a considerable surprise, leading to the creation of a new version of the ether,
quintessence [9]. With this variety of candidates, it would seem
presumptuous to claim that the solution is close at hand.

Still, the precision in the determination of the cosmological parameters
has increased. In particular, the value of the Hubble expansion parameter is
now reasonably converging towards a value of $H_0 \approx 65 \pm 10$ km/s/Mpc [10].
Also, a non-zero cosmological constant term [7, 8] and a revised method of
estimating the ages of globular clusters [11] have considerably reduced the
age problem of the universe, whose age is now estimated to be $\approx 13 \pm 2$ Gyr [10].
In addition, it appears that the Dark Matter problem is present at all scales,
from galaxies to superclusters, with an increasing contribution of the Dark
Matter component to the matter density $\rho_m$ as we move to larger structures.

Locally, an impressive amount of data has been collected by studying the 
rotation curves of galaxies which, apart from a few exotic galaxies, always 
seem to require halos extending much beyond the extent of luminous matter. 
Already at this galactic scale, typically > 80% of the matter is dark and, 
despite extensive efforts, no conventional counterpart has yet been found. 
For example, after the initial excitement created by the observation of a 

© EDP Sciences, Springer-Verlag 2001

Fig. 1. Exclusion diagram (95% C.L.) on the halo mass fraction in the form of
compact objects of mass M, for the standard halo model ($4 \times 10^{11}$ solar masses
inside 50 kpc), from all LMC and SMC EROS data 1990-98. The solid line is
the limit inferred from the three LMC microlensing candidates; the dashed line
includes in addition the SMC candidate. The MACHO 95% C.L. accepted region
(Alcock et al. [12, 13]) is the hatched area, with the preferred value indicated by
the cross (from Lasserre et al. [14]).



small set of compact halo objects, detected by their microlensing effects
on very large samples of stars [4, 5], more precise measurements have
revealed that these MACHOs typically represent less than 10 to 20% of the
halo content [12-14]. The present experimental situation of the MACHO
searches is summarized in Figure 1, where the limits reached by the EROS
experiment [14] now exclude most of the mass range of MACHOs
as contributing a significant part of the galactic Dark Matter. Similarly,
although Hubble Space Telescope observations of the Hubble Deep Field were
initially interpreted as demonstrating that white dwarf stars represent 50%
of the galactic Dark Matter [15], more recent data indicate that this
proportion is in fact < 10% [16,17]. Attention is now shifting towards ionized
gas, which might have been observed in recent HST observations [18].

But the Dark Matter problem is even more glaring at larger scales, where
the proper motions of galaxies in clusters, the study of gravitational velocity
fields, the X-ray emission temperature in clusters and the lensing methods
all converge to indicate that the matter content at the supercluster scale,
$\Omega_m = \rho_m/\rho_c$, is of the order of 0.3 when expressed as a fraction of the
critical density $\rho_c = 3H_0^2/(8\pi G) \approx 1.88 \times 10^{-26}\,h^2$ kg m$^{-3}$, where $H_0$
is the Hubble parameter, $h = H_0/(100$ km/s/Mpc$)$, and $G$ is the gravitational
constant.
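
The numerical value of the critical density quoted above can be checked directly from its definition. The following is a minimal sketch, assuming standard values for $G$ and the megaparsec (these constants and the function name `critical_density` are illustrative choices, not taken from the text):

```python
import math

# Assumed physical constants (standard values, not quoted in the text).
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22   # one megaparsec in metres


def critical_density(h: float) -> float:
    """Critical density in kg/m^3 for H0 = 100*h km/s/Mpc."""
    h0_si = 100.0 * h * 1.0e3 / MPC_IN_M   # convert H0 to s^-1
    return 3.0 * h0_si**2 / (8.0 * math.pi * G)


rho_c_h1 = critical_density(1.0)     # coefficient of h^2
rho_c_h065 = critical_density(0.65)  # value for H0 = 65 km/s/Mpc

print(f"rho_c (h = 1)    = {rho_c_h1:.3e} kg/m^3")
print(f"rho_c (h = 0.65) = {rho_c_h065:.3e} kg/m^3")
```

For $h = 1$ this reproduces the coefficient of about $1.88 \times 10^{-26}$ kg m$^{-3}$, and the $h^2$ scaling gives the density for any other value of the Hubble parameter.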

It might seem disturbing that the amount of Dark Matter observed at 
the supercluster scale is significantly larger than at the galactic scale but 
the possible fractal geometry of the universe, and its lack of homogeneity 
except possibly at very large scales, has been gradually recognized as an 
element of description of the Universe [19]. The considerable Dark Matter 
densities observed at very large scales appear to imply that a large part of 
this hidden matter must be non-baryonic. This statement is based on the 
success of homogeneous nucleosynthesis [20] and its ability to predict the 
light element abundances, and notably that of helium and deuterium. 

Today, the baryonic density, expressed as a fraction of
the critical density, has narrowed to:

$\Omega_b = 0.04 \pm 0.006\,(65/H_0)^2$

where the uncertainty on the Hubble parameter $H_0 \approx 65$ km/s/Mpc is now
of the order of 15%, or possibly to:

$\Omega_b = 0.06 \pm 0.01\,(65/H_0)^2$

if the quasar data [21] are interpreted as evidence for a low deuterium to
hydrogen abundance ratio and not as a second absorption in the Lyman
forest.

Therefore, if we accept the observational evidence at large scales for a
value of $\Omega_m \approx 0.3$, we are also led to accept that, on the one hand, a
significant part of the baryons are hidden and, on the other hand, that as much as
90% of the matter content of the Universe is made of non-baryonic matter. The
requirement of a large non-baryonic dark matter component also appears to be
substantiated by the Cosmic Microwave Background (CMB) data analysis.
COBE was the first experiment [22] to measure unambiguously the
primordial fluctuation spectrum at large angular scales. More recently, several
experiments have measured with increasing precision the power spectrum
at smaller angular scales. Assuming the consistency of these experiments,
the view of a Universe with a value of $(\Omega_m + \Omega_\Lambda)$ close to 1 is now
emerging with more precision. In particular, the recent balloon measurements
of BOOMERANG and MAXIMA [23] have considerably increased the
confidence in such a flat universe. More precise analysis of these experiments,
together with the future MAP, ARCHEOPS and PLANCK experiments,
will allow the value of $(\Omega_m + \Omega_\Lambda)$ to be pinpointed.
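
The size of the non-baryonic component follows from simple arithmetic on the two baryon densities and the large-scale matter density quoted above. A minimal sketch, in which the function name is purely illustrative and the central values are used without their uncertainties:

```python
def non_baryonic_fraction(omega_b: float, omega_m: float) -> float:
    """Fraction of the matter density that is not baryonic."""
    return 1.0 - omega_b / omega_m


OMEGA_M = 0.3  # matter density at the supercluster scale, as quoted in the text

# The two nucleosynthesis values of the baryon density quoted in the text.
fractions = {omega_b: non_baryonic_fraction(omega_b, OMEGA_M)
             for omega_b in (0.04, 0.06)}

for omega_b, frac in fractions.items():
    print(f"Omega_b = {omega_b:.2f}: non-baryonic fraction = {frac:.0%}")
```

With these central values the non-baryonic fraction lies between roughly 80% and 87%, consistent with the statement that it may reach about 90% once the uncertainties are taken into account.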

Only three years ago, most observers would not have included a
cosmological $\Lambda$ term, assuming that it would be zero. As noted previously,

Fig. 2. Observational constraints on $\Omega_m$ and $\Omega_\Lambda$. The region compatible (95%
C.L.) with the BOOMERANG data, elongated along the $\Omega_0 = 1$ line, is marked by
large black dots. The region marked with small dots is consistent (95% C.L.) with
the high-redshift supernovae surveys (from De Bernardis et al. [23]).



a major surprise in the search for Dark Matter has been the recent
observation by two groups [7, 8] of an accelerating universe at large distances.
By measuring the luminosities of a few tens of type Ia supernovae, which
appear to be decent standard candle candidates after adjustment of their
light curve by a phenomenological “stretch” factor, the $(\Omega_m - \Omega_\Lambda)$
combination is determined. Combined with the BOOMERANG and MAXIMA
CMB measurements (Fig. 2), these observations strongly favor a universe
with a non-zero value of $\Lambda$, with preferred values:

$\Omega_m \approx 1/3, \quad \Omega_\Lambda \approx 2/3.$

Therefore, to remain compatible with the nucleosynthesis constraints, the
CMB and SN Ia analyses require that a significant fraction of non-baryonic
dark matter be present. This need is further confirmed by the structure
formation scenarios, which appear to require a significant fraction of some
cold dark matter component to generate the distribution of structures
observed at the present epoch. Since the preferred structure scenarios also
incorporate some fraction of Hot Dark Matter, which could be constituted
by neutrinos, we are presently faced with an uncomfortable multiplicity of
Dark Matter components (hidden baryons, cold dark matter, hot dark matter
and a large quintessential component). This is reminiscent of the situation
of physics at the turn of the previous century, faced with the enigma of the
Mercury perihelion and waiting for General Relativity to bring clarity and
elegance.

Despite this caveat, the previous discussion shows that the search for
non-baryonic dark matter is strongly motivated by present data, and three
particle candidates, Weakly Interacting Massive Particles (WIMPs), axions
and massive neutrinos, are actively searched for by several experiments. In the
following, we will summarize the experimental situation and analyze the
main detection strategies developed to identify the dark matter candidates.

2 Weakly Interacting Massive Particles (WIMPs) 

2.1 Phenomenology

In particle physics, heavy neutrinos and supersymmetric relic particles
represent two rather natural candidates for non-baryonic Cold Dark Matter.
In fact, as shown initially by the Heidelberg-Moscow experiment [24], most
if not all of the mass range of Dirac neutrinos is already excluded
experimentally, and the most popular candidates are given by supersymmetric
particles. For models incorporating the conservation of an R-parity quantum
number, it is natural to expect that a lightest supersymmetric particle exists
that will be stable, and this sparticle, often supposed to be the neutralino,
will represent an excellent Dark Matter candidate. It is also a fascinating
coincidence that, for a relatively large mass range, the cross-sections required
to produce a matter density of order unity are characteristic of electroweak
interactions. In this sense, particle physics provides a natural explanation
for the Cold Dark Matter problem. The hypotheses leading to the
Minimal Supersymmetric models (MSSM) are described extensively in Keith
Olive’s contribution, this volume [25]. On the other hand, the constraints
on the mass of supersymmetric particles are only phenomenological, and
the LEP accelerator data impose a lower bound on SUSY particle masses
of $\approx 35$ GeV under the MSSM hypothesis.

The phenomenology of MSSM models has been described by a number
of authors [26], with a rather considerable uncertainty in the predicted event
rates since some 106 parameters remain to be fixed to determine a specific
SUSY model. Scalar and axial terms can a priori be considered for the WIMP
coupling to ordinary matter. Scalar terms typically lead to an approximate
$A^2$ dependence of the cross-section on the number of nucleons $A$, and are
therefore usually predominant compared to axial couplings, which depend
on the spin of the target nucleus.

Event rates can range from a few events/kg/day for the most optimistic 
models, down to « 10“^ event/kg/day. By imposing constraints on the 
squark masses, these rates can be increased by nearly three orders of
magnitude, but the justification for this procedure is unclear. Therefore, in order
to explore a significant part of SUSY models, an increase in sensitivity by at
least four orders of magnitude is required. Direct and copious production of
SUSY particles is of course expected to be observed at the LHC around the year
2007, but dark matter direct detection experiments are complementary since
they can test the existence of a stable weakly interacting supersymmetric
particle, a priori undetectable in an accelerator experiment.

It might seem that the existence of a cosmological constant term
contributing typically 2/3 of the total energy content of the universe would
further increase the difficulty for experimentalists of detecting dark matter
particles. On the contrary, the $\Lambda$ term appears to result in an increase in
the number of interactions with ordinary matter. This is due to the fact
that a cosmological term will have no significant contribution locally,
while a lower dark matter density at decoupling requires a
higher annihilation cross-section and, rather generally, a higher interaction
cross-section.
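
The inverse relation between relic density and annihilation cross-section invoked above can be illustrated with a rough numerical sketch. It assumes the common order-of-magnitude estimate $\Omega h^2 \approx 3 \times 10^{-27}\ \mathrm{cm^3\,s^{-1}} / \langle\sigma v\rangle$, which is not given in this text and is used here only as an assumption; the function name and the choice $h = 0.65$ are likewise illustrative:

```python
# Assumed order-of-magnitude relation (not from this text):
#   Omega * h^2  ~  3e-27 cm^3/s / <sigma v>
RELIC_CONSTANT_CM3_S = 3.0e-27


def required_sigma_v(omega: float, h: float = 0.65) -> float:
    """Annihilation cross-section times velocity (cm^3/s) giving relic density omega."""
    return RELIC_CONSTANT_CM3_S / (omega * h**2)


sigma_v_no_lambda = required_sigma_v(1.0)    # all of the critical density in matter
sigma_v_with_lambda = required_sigma_v(0.3)  # matter density with a Lambda term
enhancement = sigma_v_with_lambda / sigma_v_no_lambda

print(f"<sigma v> for Omega = 1.0: {sigma_v_no_lambda:.2e} cm^3/s")
print(f"<sigma v> for Omega = 0.3: {sigma_v_with_lambda:.2e} cm^3/s")
print(f"enhancement factor: {enhancement:.2f}")
```

Under this assumption, lowering the required dark matter density from 1 to 0.3 raises the annihilation cross-section by a factor of about 3, independently of the constant used.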

We also expect that these remnant particles will be trapped in the grav- 
itational potential well of the galaxy. Since these particles, unlike ordinary 
matter, can hardly dissipate their kinetic energy, the halo formed by these 
particles is usually considered to be grossly spherical and non rotating (al- 
though Sikivie has argued that the infall of Dark Matter might result in 
caustic structures [27]). This halo would then extend to much larger dis- 
tances than the dissipating ordinary matter and explain the rotation curves 
observed in galaxies. The standard parameters used to describe the WIMP
halo include its local density in the 0.3-0.5 GeV/cm$^3$ range, an assumption
of a Maxwellian velocity distribution with rms velocity
$v_{\rm rms} \approx 270$ km s$^{-1}$, and a WIMP escape velocity from the
halo $v_{\rm esc} \approx 650$ km s$^{-1}$. Using this
picture of a WIMP halo, and following the seminal paper by Drukier and 
Stodolsky [28] on coherent neutrino interactions, Goodman and Witten [29] 
proposed the method of WIMP direct detection involving elastic collisions 
of a WIMP on a nucleus, schematically represented in Figure 3. 

Except at high WIMP masses, where the momentum transfer may require the
form factor of the nucleus to be taken into account more precisely,
the interaction is coherent over the whole nucleus and the average energy
transferred in the collision can be approximated by the expression [30]:

$$\langle E \rangle = m_A \langle v^2 \rangle \left( \frac{m_X}{m_X + m_A} \right)^2 \approx 1.6\, A \ \mathrm{keV} \left( \frac{m_X}{m_X + m_A} \right)^2 .$$

It can readily be seen that for an optimal target mass, $m_X = m_A$, this
energy is approximately 0.4 x A keV, giving typical energy transfers in the
few keV to a few tens of keV range for SUSY particles with mass compatible
with the constraints issued from the LEP experiments.
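
As a quick numerical illustration of the expression above, the following
Python sketch evaluates the approximation
1.6 A keV x (m_X / (m_X + m_A))^2 for a few target nuclei. The function
name, the approximation m_A ≈ 0.9315 A GeV for the nuclear mass, the rounded
mass numbers and the example WIMP masses are assumptions made for this
illustration only; they are not taken from reference [30].

```python
# Minimal sketch: average recoil energy from the approximate expression
# <E> ~ 1.6 * A keV * (m_X / (m_X + m_A))**2.
# Assumption: the nuclear mass is taken as m_A ~ A * 0.9315 GeV.

AMU_GEV = 0.9315  # approximate mass of one atomic mass unit, in GeV


def mean_recoil_energy_kev(wimp_mass_gev, mass_number):
    """Return the approximate average recoil energy in keV."""
    target_mass_gev = mass_number * AMU_GEV
    ratio = wimp_mass_gev / (wimp_mass_gev + target_mass_gev)
    return 1.6 * mass_number * ratio ** 2


if __name__ == "__main__":
    # Rounded mass numbers, chosen for illustration.
    targets = {"Si": 28, "Ge": 73, "I": 127}
    for name, a in targets.items():
        for m_x in (20.0, 50.0, 100.0):
            e_mean = mean_recoil_energy_kev(m_x, a)
            print(f"{name} (A={a}), m_X={m_x:5.1f} GeV: <E> ~ {e_mean:5.1f} keV")
```

When the WIMP mass equals the target mass, the squared ratio is 1/4 and the
function returns 0.4 x A keV, as stated in the text.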




G. Chardin: Dark Matter: Direct Detection 




Fig. 3. Schematic principle of a WIMP direct detection experiment. A WIMP
scattering induces a low-energy (a few keV to a few tens of keV) nuclear
recoil, which can subsequently be detected through the phonon, charge or
light signals produced in the target material.



2.2 Experimental signatures 

With the previous characteristics of the WIMP halo particles, the
exponentially decreasing shape of the WIMP recoil energy spectrum is anything
but distinctive compared with the radioactive background, whose energy
distribution usually also rises at low energies. For obvious kinematical
reasons, however, WIMPs produce detectable signals almost exclusively through
nuclear recoils, as opposed to radioactivity, which involves mostly electron
recoils at low energies. Therefore, several methods have been devised to
distinguish nuclear from electron recoils as efficiently as possible. When
this discrimination is achieved, the main remaining background is due to
neutrons and to surface electron and nuclear recoils.

In addition, two other signatures might be used. Firstly, in the hypoth- 
esis of a non-rotating spherical halo, the interaction rate is expected to be 
modulated by the variation in relative velocity between the (supposedly non 
rotating) galactic halo and the Earth in its trajectory around the Sun [31]. 
The velocity of the Earth through the Galaxy can be represented by the 
following expression: 



$$v_{\rm Earth} = v_{\rm Sun} + v_{\rm orb} \cos\gamma \, \cos[\omega (t - t_0)],$$

where $v_{\rm orb} \approx 30$ km s$^{-1}$ is the Earth orbital velocity
around the Sun, the angle $\gamma \approx 60°$ is the inclination of the
Earth orbital plane with respect to the galactic plane,
$\omega \approx 2\pi/365$ radian/day, and the phase is given by
$t_0$ = June 2nd. Due to its importance, this annual modulation signature will be
discussed in more detail in a following section. Secondly, a detector sensitive 
to the recoil direction would be able to measure its diurnal modulation due 
to the Earth rotation on its axis [32]. However, at the very low energies 
involved in the WIMP interactions, this represents an ambitious objective, 
which has yet to be realized. 
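
The annual velocity modulation described above can be sketched numerically
as follows. The orbital velocity, inclination, angular frequency and phase
are those quoted in the text; the value of the Sun velocity through the halo
(230 km/s here) is not given in this section and is an assumption of this
sketch, as are the function and constant names.

```python
import math

V_SUN_KMS = 230.0                 # ASSUMED Sun velocity through the halo (not from the text)
V_ORB_KMS = 30.0                  # Earth orbital velocity around the Sun (from the text)
GAMMA_DEG = 60.0                  # inclination of the orbital plane (from the text)
OMEGA = 2.0 * math.pi / 365.0     # angular frequency in radian/day (from the text)
T0_DAY_OF_YEAR = 153              # June 2nd in a non-leap year


def earth_velocity_kms(day_of_year):
    """Earth velocity through the halo: v_Sun + v_orb cos(gamma) cos[omega (t - t0)]."""
    projection = V_ORB_KMS * math.cos(math.radians(GAMMA_DEG))
    return V_SUN_KMS + projection * math.cos(OMEGA * (day_of_year - T0_DAY_OF_YEAR))


def relative_velocity_modulation():
    """Fractional amplitude of the velocity modulation, v_orb cos(gamma) / v_Sun."""
    return V_ORB_KMS * math.cos(math.radians(GAMMA_DEG)) / V_SUN_KMS


if __name__ == "__main__":
    for day in (1, 153, 336):
        print(f"day {day:3d}: v_Earth ~ {earth_velocity_kms(day):6.1f} km/s")
    print(f"relative modulation ~ {100.0 * relative_velocity_modulation():.1f} %")
```

With these inputs the velocity is maximal around June 2nd and minimal about
half a year later, with a fractional variation of a few percent.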

Once a WIMP signal has been observed in a first detector type, the
interaction rates on different target materials (e.g. Ge, Si, Na, I...) should
be compared to test the consistency of the WIMP hypothesis, together with
the vector or axial coupling of their interactions with ordinary matter.

A different experimental identification is provided by the indirect detec- 
tion techniques, which use the fact that WIMPs may a priori accumulate 
at the center of the Earth, of the Sun or even at the galactic center [33]. 
The WIMPs may then annihilate in sufficient quantities for the high energy 
neutrino component produced in the disintegrations to be detectable at the 
surface of the Earth and identified through their directional signature. 

In the following, we will review the direct detection experiments ac- 
cording to their radioactive background identification capabilities (nuclear 
versus electron recoils) and briefly compare the sensitivities of direct vs. 
indirect detection methods. 

2.3 WIMP direct detection experiments without discrimination 

The characteristics of some of the main direct detection experiments without 
background discrimination are presented in Table 1. With such detectors, 
two requirements, low energy threshold and low radioactive background 
rate, must be met to achieve the sensitivities required for WIMP detec- 
tion. Initial Dark Matter direct detection experiments were by-products of 
double-beta decay experiments using germanium detectors [24, 34, 35]. To
detect low mass WIMPs and to test the Cosmion hypothesis (an ad hoc
particle which might have solved the Solar neutrino deficit problem), a
dedicated experiment using a silicon detector was carried out by Caldwell
et al. [36].

For most of the detectors, it is fundamental to note that, for a given
initial energy deposition, electronic and nuclear recoils usually provide very
different signal amplitudes. The ratio of the nuclear to electron yield for
the same initial energy is usually called the quenching factor, and a priori
depends on the energy. For example, to produce the same light output as a
3 keV electron recoil, an iodine nucleus (quenching factor ≈ 0.08) will
have to deposit 3/0.08 ≈ 35 keV, and a sodium nucleus (quenching factor
≈ 0.3) ≈ 3/0.3 = 10 keV. In other words, a small quenching factor for
nuclear recoils increases the effective threshold for WIMP interactions in
inverse proportion to this factor, whereas the signal/noise ratio improves by
the same factor since nuclear recoils are more effectively packed in a smaller
visible energy interval.

Fig. 4. Low energy spectrum (horizontal axis: energy in keV) of one of the
enriched $^{76}$Ge detectors of the Heidelberg-Moscow experiment. By adjusting
the amplitude of the theoretical recoil spectrum (here, for a 100 GeV WIMP
mass), the limit of sensitivity for a given WIMP mass can be derived (from
Baudis et al. [37]).
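
The quenching-factor arithmetic above can be written as a one-line
conversion between visible (electron-equivalent) energy and nuclear recoil
energy. The sketch below is only an illustration: the function name is
hypothetical, and the quenching factor is treated as energy independent,
which the text notes is not guaranteed.

```python
# Minimal sketch: convert an electron-equivalent energy into the nuclear
# recoil energy that gives the same signal, for a constant quenching factor.

def recoil_energy_kev(visible_energy_kev_ee, quenching_factor):
    """Recoil energy (keV) giving the same signal as a given energy in keV e.e."""
    if quenching_factor <= 0.0:
        raise ValueError("quenching factor must be positive")
    return visible_energy_kev_ee / quenching_factor


if __name__ == "__main__":
    # Quenching factors quoted in the text for iodine and sodium recoils.
    for nucleus, q in (("I", 0.08), ("Na", 0.3)):
        e_recoil = recoil_energy_kev(3.0, q)
        print(f"{nucleus}: 3 keV e.e. corresponds to ~ {e_recoil:.1f} keV recoil energy")
```

For iodine the division gives 37.5 keV, which the text quotes as ≈ 35 keV,
within the precision of the quenching factor itself.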

2.3.1 Germanium detectors 

For coherent coupling, the most sensitive of the present non-discriminating
experiments use ultrapure germanium detectors at liquid nitrogen
temperature. In this respect, the Heidelberg-Moscow and the IGEX
experiments [37, 38] are reaching sensitivities comparable to those of the best
discrimination experiments. These two experiments benefit from the high
purification performances developed industrially for the fabrication of high
purity Ge detectors. In addition, enriched $^{76}$Ge detectors reduce by more
than one order of magnitude the cosmogenic production of $^{68}$Ge. For these
detectors, the ionization yield is typically three times smaller for nuclear
recoils than for electron recoils (quenching factor ≈ 0.3).

The Heidelberg-Moscow experiment [37] obtains the most impressive
background performances with a rate of 0.07 event/kg/keV/day at a threshold
energy of 9 keV electron equivalent (e.e.), corresponding to a nuclear
recoil energy of ≈ 30 keV (Fig. 4). At an energy of 20 keV e.e. (≈ 60 keV
recoil energy), the rate is even reduced to ≈ 0.015 event/kg/keV/day. Rates
observed by the IGEX experiment [38] are slightly higher, but this
experiment benefits in its detectors from a lower energy threshold, ≈ 4 keV e.e.,
or ≈ 13 keV recoil energy, making it sensitive to low energy WIMP
interactions. These two experiments obtain very similar sensitivities, at the level
of 10“® pb for WIMP-nucleon cross-section (Fig. 5). 

Fig. 5. Scalar coupling WIMP-nucleon cross-section upper limits obtained by
direct detection experiments. The regions above the curves are excluded at 90%
CL. Also represented are the allowed MSSM models compatible with accelerator
data, the region selected by the DAMA candidate, and the expected sensitivities of
some of the main direct detection experiments. The dashed-dotted line represents
the expected sensitivity of the CDMS, EDELWEISS and HDMS experiments in
their present stages, and the dashed line represents the approximate expected
sensitivity of the next stages of the CDMS-II, CRESST-II, EDELWEISS-II and
GENINO experiments.



A development following the Heidelberg-Moscow experiment, the HDMS
detector uses a small Ge detector shielded inside a well-type Ge
detector. The outer crystal acts as an anti-Compton veto, where the
gamma-ray background is identified by its interaction in both detectors.
The HDMS experiment is now operating in the Gran Sasso laboratory, but
has not yet improved the sensitivity obtained by the previous
Heidelberg-Moscow experiment [39].

A second and more ambitious approach is proposed in the GENIUS
project, by the same group [40]. Here, a total of ≈ 300 Ge detectors
of individual mass 2.5 kg would be operated in a 12 m diameter tank filled
with extremely radiopure liquid nitrogen. The GENIUS concept would be
tested in a 100 kg stage, GENINO [41], and several prototype detectors have
already been operated in liquid nitrogen with excellent energy resolution and
threshold. This project will be discussed in the following.

2.3.2 Scintillator detectors 

The ELEGANTS experiment, using NaI and CaF$_2$ crystals [42,43], and
the BPRS and DAMA experiments with CaF$_2$ crystals [44,45], have used
the scintillation properties of these crystals to test the existence of WIMPs
with axial (also called spin-dependent) couplings. The non-zero spin of
the sodium and fluorine nuclei, and the extremely favorable nuclear spin
matrix element of the fluorine nucleus [45] make these experiments among
the most competitive for dominant axial interactions (see, however, “Direct
vs. indirect detection”). At higher WIMP mass, the Osaka ELEGANT-VI
experiment [43] obtains one of the most sensitive and reliable limits using a
high-purity CaF$_2$ detector. On the other hand, the radioactive background
rates achieved for these crystals (of the order of a few events/kg/day/keV)
are much higher than that of the germanium detectors and, without an efficient
discrimination technique, these experiments are not competitive for scalar
(or spin independent) interactions.

2.3.3 Cryogenic experiments

Over the last ten years, cryogenic detectors have been developed by several 
groups [46,47]. The difficulty of using cryogenic systems with temperatures 
in the tens of millikelvin range has been justified by the increase in sensi- 
tivity and redundancy in information obtained by the cryogenic detectors, 
or bolometers. At very low temperatures, it becomes possible to consider 
real calorimetric measurements of very small energy deposition, and energy 
thresholds below 1 keV of recoil energy have already been achieved [48,49]. 
Indeed, at a temperature of about 10 mK, a 1 keV energy deposited in
a 100 g detector results in a typical temperature increase of about 1 μK,
which can be measured using conventional electronics. In addition, the cost 
of an elementary excitation is much lower than that of classical detectors 
such as semiconductors or scintillators. Therefore, cryogenic detectors offer 
the possibility of unprecedented sensitivities. The fundamental resolution 
of these detectors, given by thermodynamic fluctuations in the energy of 
the detector: 

$$\Delta E_{\rm FWHM} \approx 2.35 \sqrt{k_B C T^2},$$

where $k_B$ is the Boltzmann constant, C is the heat capacity of the detector,
and T is the temperature, is in the tens of electron-volt range for detectors
in the 100 gram range at a temperature of 20 mK. In addition, the high
energy phonons produced after the interaction are rapidly degraded but
have relatively long lifetimes for individual phonon energies of the order of
$10^{-4}$ eV (about 1 kelvin). These phonons can be detected in the purely
thermal mode [49-56], or when they are still out of equilibrium [48,57-61].
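
The orders of magnitude quoted in this section can be cross-checked with a
short calculation of the thermodynamic limit 2.35 sqrt(k_B C T^2). In the
sketch below, the heat capacity is not a measured value: it is inferred from
the statement that 1 keV deposited in a 100 g detector at 10 mK raises the
temperature by about 1 μK, and it is then scaled to 20 mK assuming a Debye
T^3 law. Both steps, and all function names, are assumptions of this
illustration.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K
EV = 1.602176634e-19    # joules per electron-volt


def heat_capacity_from_pulse(energy_ev, delta_t_kelvin):
    """Heat capacity C = E / dT (J/K) implied by a given temperature rise."""
    return energy_ev * EV / delta_t_kelvin


def fwhm_resolution_ev(heat_capacity_j_per_k, temperature_k):
    """Thermodynamic limit 2.35 * sqrt(k_B * C * T**2), converted to eV."""
    variance_j2 = K_B * heat_capacity_j_per_k * temperature_k ** 2
    return 2.35 * math.sqrt(variance_j2) / EV


# ASSUMPTION: C at 10 mK inferred from "1 keV gives about 1 microkelvin".
C_10MK = heat_capacity_from_pulse(1000.0, 1.0e-6)
# ASSUMPTION: Debye scaling C ~ T**3 between 10 mK and 20 mK.
C_20MK = C_10MK * (20.0 / 10.0) ** 3

if __name__ == "__main__":
    print(f"C(10 mK) ~ {C_10MK:.2e} J/K")
    print(f"FWHM at 10 mK ~ {fwhm_resolution_ev(C_10MK, 10e-3):.1f} eV")
    print(f"FWHM at 20 mK ~ {fwhm_resolution_ev(C_20MK, 20e-3):.1f} eV")
```

Under these assumptions the limit at 20 mK comes out at a few tens of eV,
consistent with the range quoted in the text.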

The CRESST [48], ROSEBUD [49], Tokyo [52] and MIBETA [50,51] 
experiments are presently using their cryogenic detectors for WIMP search 
without background discrimination techniques (see Tab. 1). Having solved 
initial problems of spurious events at low energies, the CRESST experiment 
is now using its sub-keV energy threshold on massive 262 g sapphire
detectors (light Al and O nuclei) to explore the WIMP mass region of a few
GeV to a few tens of GeV, in principle excluded by accelerator data under
conventional hypotheses, and possible axial couplings.

With the radioactive background rates achieved until now, and without 
nuclear recoil identification, the MIBETA and ROSEBUD experiments [49-51]
are presently not competitive but are pursuing efforts to reduce their
radioactive background rate and use their low energy thresholds to be sensitive
to low WIMP mass. On the other hand, using lithium fluoride detectors [45]
allows the Tokyo cryogenic experiment [52], although in a shallow site and
using a set of 8 × 21 g crystals, to reach the best direct detection
sensitivity for axially-coupled WIMPs with mass below 5 GeV. In a next stage,
this experiment intends to move to an underground laboratory in order to 
decrease the level of its cosmic background. 

2.3.4 The purest of all materials 

In terms of radioactive background, what are the purest materials?
Germanium, and possibly also silicon, are certainly two very good
candidates, since the impurity concentrations achieved in detector quality
crystals are of the order of $10^{10}$ impurities per cm$^3$. In addition,
only a very small fraction of these impurities are radioactive, provided the
cosmogenic production of radioactive isotopes has been limited. Liquid
nitrogen, proposed by GENIUS as a shield [40], can also be purified to
extremely high levels. However, there exist two liquids where the level of
impurities is essentially zero. At subkelvin temperatures for superfluid
$^4$He, and below 1 millikelvin for $^3$He, essentially no impurities can
remain in a stable way in these liquids. At a temperature of 100 μK, not
even a single atom of the chemical twin $^4$He is able to remain at
equilibrium in a kilogram of superfluid $^3$He! This extreme purity has been
a motivation for proposing a dark matter detector where the temporary
intrusion of an external particle creates quasiparticles which are detected
by the attenuation of a vibrating wire inside the liquid. Energy thresholds
of a few keV have already been achieved with small $^3$He samples, and a
multicell detector has been proposed [62,63]. Two drawbacks, however, reduce
the attractiveness of this elegant detector: the cost of $^3$He is of the
order of 1 000 US dollars per gram, and no discrimination has been proposed
against the very low energy X-ray background which may mimic WIMP
interactions.

Table 1. Main characteristics of some direct detection experiments without
background discrimination capabilities.

2.4 WIMP direct detection experiments with discrimination

Several methods have been proposed, and some of them implemented in
full-scale experiments, to distinguish nuclear recoils, through which WIMPs
will primarily interact, from electron recoils. These methods include the
difference in scintillation decay time for NaI crystals, the scintillation versus
ionization yield in liquid xenon, the scintillation versus phonon yield in
cryogenic CaWO$_4$ crystals, or the difference in ionization versus phonon
ratio in germanium or silicon detectors. The discrimination method usually
involves the fact that low energy nuclear recoils correspond to very local
energy deposition. A different sharing of energy between the more expensive
degrees of freedom, like ionization (typically a few eV per electron-hole
pair created) and light (typically a few eV to a few tens of eV per photon
created), and the inexpensive degrees of freedom like phonons (typically
$10^{-4}$ eV) is then usually observed.

Table 2 summarizes the main characteristics of some of the main WIMP 
direct detection experiments with such background identification. 

2.4.1 NaI experiments

The first experiments that significantly improved their performances by
using background rejection properties were the NaI-based
experiments [64-68]. These experiments use the different scintillation time
constants of electron and nuclear recoils to reject the gamma-ray
radioactive background. For coherent couplings, where the interactions are
expected to occur predominantly on iodine nuclei due to the approximate
$A^2$ dependence of the interaction cross-section, the average visible energy
of WIMP interactions will be in the few tens of keV recoil energy range.
This means that almost all the WIMP candidate events will lie very close to
the energy threshold of the experiment (2-3 keV e.e., or 25-35 keV recoil
energy, for the DAMA experiment). At these energies, with the present
light collection efficiencies on large crystals of 3-5 photoelectrons/keV e.e.,
it then seems impossible to use the time structure of the light pulse to
discriminate efficiently nuclear recoils from electron recoils below ≈ 6 keV e.e.
(≈ 70 keV recoil energy on iodine). Therefore, the DAMA and ELEGANT-V
experiments [42,64,65] have used an annual modulation analysis to reach
a higher sensitivity to coherent WIMP interactions with their NaI detectors.
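
To make the photoelectron argument above more concrete, the sketch below
counts the photoelectrons available near threshold for the light yields
quoted in the text. The relative Poisson fluctuation 1/sqrt(N) is used here
only as a crude, assumed proxy for how poorly the time profile of a pulse is
sampled; it is not the pulse-shape analysis used by the experiments, and the
function names are hypothetical.

```python
import math


def photoelectrons(energy_kev_ee, yield_pe_per_kev):
    """Mean number of photoelectrons for a given visible energy and light yield."""
    return energy_kev_ee * yield_pe_per_kev


def relative_poisson_fluctuation(n_photoelectrons):
    """Relative statistical fluctuation 1/sqrt(N) of a Poisson count of mean N."""
    return 1.0 / math.sqrt(n_photoelectrons)


if __name__ == "__main__":
    for energy in (2.0, 3.0, 6.0):        # visible energies in keV e.e.
        for light_yield in (3.0, 5.0):    # photoelectrons per keV e.e. (from the text)
            n_pe = photoelectrons(energy, light_yield)
            fluct = relative_poisson_fluctuation(n_pe)
            print(f"{energy:.0f} keV e.e., {light_yield:.0f} pe/keV: "
                  f"N ~ {n_pe:4.0f}, 1/sqrt(N) ~ {fluct:.2f}")
```

Near the 2-3 keV e.e. threshold only about 6 to 15 photoelectrons are
available per event, which illustrates why a pulse-shape measurement is so
difficult at these energies.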

The factors limiting the discrimination performances of the NaI
experiments at higher energies stem from the time structure of the
scintillation, which appears to vary with the energy interval and to present
at least two initially unexpected populations [68-71]. Both electron surface
events [68] and nuclear surface events [68-70] present time characteristics
which differ from the nuclear and electron volume recoil events, and must be
included as additional degrees of freedom in the estimation of the nuclear
recoil component. The situation is further complicated by the fact that
degraded alpha particles and heavy nuclei recoils at low energies (e.g. Po
nuclei associated with surface alpha contamination when the alpha particle
is undetected) have yet different time constants. For these reasons, it
seems unlikely that significant improvements can be accomplished using this
technique.

Table 2. Main characteristics of some direct detection experiments with
background discrimination capabilities.

Fig. 6. Nuclear and electron recoil separation exhibited in a phonon amplitude
vs. charge amplitude (keV e.e.) scatter diagram, using a 70 g charge-phonon Ge
detector of the EDELWEISS experiment exposed to a neutron and γ-ray source
(from Benoit et al. [74]).

2.4.2 Cryogenic detectors 

Following initial developments on silicon detectors by Spooner et al. [72], the
CDMS [58-60] and EDELWEISS [55-57, 61] collaborations have achieved
high performances of particle identification using the simultaneous
detection of charge and heat signals in germanium and silicon detectors (Fig. 6).
Rejection of the γ-ray background at the ≈ 99% level has been obtained
in germanium detectors with a charge measurement at low collection
voltage, and a phonon measurement using Neutron Transmutation Doped
(NTD) sensors. Despite the problem of incomplete charge collection for
surface events [73,74], these detectors have become the most sensitive for
WIMP Dark Matter search, and are less limited by systematics than previous
discrimination experiments.

The Stanford group [58-60] has developed another particularly elegant 
and integrated method using superconducting aluminum traps [75], which 
collect phonons at energies exceeding the 1 K (≈ $10^{-4}$ eV) gap of
aluminum. The quasiparticles created in the aluminum films are transferred
to thin tungsten sensor lines maintained by electrothermal feedback [76] in 
the normal-superconducting transition at a relatively comfortable temper- 
ature of 80 mK. The 1 K aluminum gap allows these detectors to be fairly 
insensitive to external temperature fluctuations and microphonics. In addi- 
tion, the fast response of transition edge sensors (TES), associated with a 
SQUID-array readout [77], provides a timing of the interaction at the
microsecond level. The distribution of the signal amplitude among the four
sensors may then yield a millimeter position resolution of the interaction,
making these detectors also competitive for real-time solar neutrino
detection. Even more
importantly, these fast phonon detectors allow the identification of surface 
events by the reduced risetime of the phonon signals [78,79]. The CDMS 
experiment is now operating these ZIP (Z-sensitive Ionization and Phonon) 
detectors, prototypes for the larger scale CDMS-II experiment. 

Another line of development has been followed for the TES sensors de- 
veloped in the CRESST experiment, which offer at present the best energy 
sensitivities and thresholds well below 1 keV for massive sapphire detectors 
of 262 g at very low temperatures (≈ 12 mK) [48]. Even more importantly,
the Munich group has achieved an excellent discrimination between electron
and nuclear recoils using the simultaneous detection of light and heat
signals [80]. After initial developments by the Milano and Orsay groups at
much higher energies [81,82], the Munich group [80] has achieved efficient
particle discrimination down to energies of a few keV using a 6 g
scintillating CaWO$_4$ crystal. In this setup, the light emission is detected by a
thin sapphire wafer, adjacent to the target crystal, and sensed by a TES 
thermometer of very high sensitivity. The discrimination performances of 
this small device (> 99.7% discrimination above 15 keV recoil energy) are 
already extremely impressive, and more massive 300 g crystal detectors are 
being developed which will be tested in the CRESST experiment. 

High impedance thin film sensors in the metal-insulator transition
constitute an alternative to transition edge sensors, and already present
excellent energy resolution in the thermal regime [83]. Comparable to NTD
sensors in terms of sensitivity, these film sensors benefit from a
manufacturing process allowing them to be deposited on a detector surface without
a delicate manual intervention. At present, the NTD-based charge-phonon
detectors, developed by the CDMS and EDELWEISS experiments [53-57],
still offer the best performances for identification of nuclear recoils, with
residual event rates at the level of « 0.01 event/kg/keV/day, correspond- 
ing to sensitivities to cross-sections of 10“® pb [84]. Therefore, a precise 
calorimetric measurement of the energy in the thermal mode represents an 
important alternative technique to the out-of-equilibrium sensors developed 
by the CRESST [48] and by the Stanford groups [58-60]. 







The Primordial Universe 




Fig. 7. Modulated amplitude (in evt/day/kg/keV) as a function of elapsed time
(in days) recorded during the four phases of the DAMA NaI experiment, covering
a time interval of three years. The vertical dashed and dotted lines represent,
respectively, the expected position of maxima and minima of annual modulation
under the conventional hypothesis of a homogeneous and non rotating WIMP halo
(from Bernabei et al. [85]).



2.4.3 A first WIMP candidate? 

In 1998, the DAMA experiment, using a total mass of ≈ 100 kg of high
purity NaI crystals, reported a first indication of an annual modulation
using a data set of ≈ 12.5 kg × year, recorded over a fraction of a year [65].
Apart from the ELEGANT-V experiment [42], which is using NaI scintillators
of total mass 730 kg, the DAMA experiment is presently running the
largest experiment for WIMP direct detection. Compared to ELEGANT-V,
DAMA is using NaI crystals with a lower radioactive background, with
differential rates at low energies of approx. 2-3 events/kg/keV/day down
to an energy of 2 keV electron equivalent (e.e.) (≈ 25 keV recoil energy).

After its initial report and a second data set of ≈ 41 kg × year where the
modulation was confirmed [65], the DAMA group has recently published
an analysis involving a 160 kg × year data sample recorded over a three
year time interval [85]. The resulting modulation can be seen in Figure 7.
Taken at face value, the DAMA observation presents a > 4 sigma statistical
significance, with both phase and amplitude consistent over a period of three
years with a WIMP signature. Interpreted in terms of a WIMP candidate,
the mass appears to be ≈ (52 ± 10) GeV and the WIMP-nucleon
cross-section ≈ (7 ± 1) × $10^{-6}$ picobarn. The allowed region, delimited
by a three sigma contour, is represented in Figure 5. If confirmed, this
observation would appear to favor relatively high values of the $\tan\beta$
parameter in a relaxed SUGRA scheme [86].

In the shallow Stanford site, despite a huge muon background, the CDMS
experiment has reached a sensitivity excluding a large part of the DAMA
region. Using a subset of three 160 g Ge detectors with NTD sensors,
the CDMS experiment [84] has reached the level of ≈ 1 evt/kg/day for
a ≈ 10 kg × day exposure, convincingly identifying a signal of nuclear
recoils (Fig. 8). On the other hand, the exclusion of most of the DAMA
region has only been realized at the expense of a subtraction of the neutron
background, estimated from the number of interactions in silicon detectors,
and from the multiple scatter interactions in adjacent germanium detectors.
Without this difficult background subtraction, the number of nuclear recoils
observed by CDMS is just compatible with the DAMA claim, and it seems
that more precise experiments are required to draw a definite conclusion on
this fundamental question. The CDMS experiment is presently running its
new generation of ZIP detectors with an additional neutron shield close to
the detectors and may bring a more decisive answer in the near future.

Fig. 8. Charge vs. recoil energy scatter diagram (anticoincident with muon veto)
recorded by the CDMS experiment at the Stanford shallow site using three 160 g
Ge detectors. Solid curve: expected position of nuclear recoils. Dashed curves:
limits of the nominal 90% nuclear-recoil acceptance region. Vertical dashed line:
10 keV analysis threshold. Circled points: nuclear recoils (from Abusaidi et al.
[84]).

2.4.4 Critical discussion and annual modulation signature 

Despite the considerable interest generated by the DAMA announcement
which, if verified, would entail the discovery of the first supersymmetric
particles, a number of criticisms have been raised against the DAMA
analysis [87]. Some of these criticisms address the annual modulation technique,
which we discuss in more detail in the following since it represents one of
the most celebrated signatures of a WIMP signal.

The annual modulation signature for WIMP search must be used with
extreme caution. Even if we disregard the fact that some authors (see
e.g. [27]) have noted that inhomogeneities in the WIMP halo could lead
to severe modifications to the standard parametrization of the modulation,
several difficulties must be mastered in order to use this signature.

As noted previously, the Earth velocity through the halo is modulated by
its motion around the Sun:

$$v_{\rm Earth} = v_{\rm Sun} + v_{\rm orb} \cos\gamma \, \cos[\omega (t - t_0)].$$

The resulting amplitude of the WIMP annual modulation is of the order of 7%,
under the optimistic assumption that all events recorded are due to WIMP
interactions. On the other hand, the few experiments which have tried until
now to use the annual modulation technique are using detectors with very
limited or no background discrimination capabilities at low energies, where
most of the signal is expected to appear. The limited signal-to-noise ratio
in the signal region will correspondingly reduce the already small amplitude
of the modulation.

Ramachers et al. and Hasenbalg [88, 89] have studied the minimal de- 
tector mass required to efficiently detect the modulation signature. Their 
study shows that detectors in the 100 kg range such as the DAMA or 
ELEGANTS detectors can be effectively used for this purpose assuming 
the most favorable SUSY models and an almost pure WIMP signal at low 
energies (Fig. 9). 

Even under these extremely favorable assumptions, however, a reliable 
detection of the annual modulation in the interaction rate already requires 
a stability of the detector performances much better than 1%. On the other 
hand, the detector mass must be increased to several tons or more for most 
of the SUSY models. Even for the most favorable models, a more realistic 
10% WIMP proportion in the recorded data requires a mastering of all the 
calibration factors and systematic effects at a level smaller than 0.1%, which 
appears extremely ambitious, and probably unrealistic close to threshold. 
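
The dilution argument of the last paragraphs can be put into numbers with a
short sketch. The 7% modulation and the 10% WIMP proportion are taken from
the text; the event-count estimate uses the standard counting result that
the fitted fractional amplitude of a cosine modulation has a variance of
about 2/N for N events, which is an assumption added here for illustration
and not a calculation from references [88, 89]. Function names are
hypothetical.

```python
import math


def observed_modulation(wimp_modulation, wimp_fraction):
    """Fractional modulation of the total rate when only part of it is WIMPs."""
    return wimp_modulation * wimp_fraction


def events_needed(observed_amplitude, n_sigma):
    """Rough number of events to see a fractional amplitude at n_sigma.

    ASSUMPTION: the amplitude estimate has variance ~ 2/N (pure counting
    statistics, no systematic effects).
    """
    return math.ceil(2.0 * (n_sigma / observed_amplitude) ** 2)


if __name__ == "__main__":
    for fraction in (1.0, 0.1):
        amplitude = observed_modulation(0.07, fraction)
        n_events = events_needed(amplitude, 3.0)
        print(f"WIMP fraction {fraction:4.0%}: modulation ~ {amplitude:.2%}, "
              f"~ {n_events} events for 3 sigma")
```

With a 10% WIMP proportion the modulation of the total rate drops to about
0.7%, which is why the calibration and systematic effects have to be
controlled at a level well below this value.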

From the previous comments, it is clear that the control of spurious 
modulations is essential. In fact, temperature and environmental effects are 
expected to present seasonal and yearly effects. An example is provided by 
the seasonal effects on the atmospheric decay region induced by the baro- 
metric pressure variations, which result in a modulation of the high-energy 
muon flux, observed for example by the MACRO experiment. This seasonal 
variation in the muon flux in turn results in a modulation of the neutron 
background induced by high-energy muon interactions in the vicinity of a 
dark matter experiment, in particular in its lead shielding. 

Fig. 9. Minimum target mass required for a direct detection experiment to detect 
an annual modulation WIMP signature as a function of the WIMP mass, expressed 
in GeV (dots). A zero background rate and an optimistic WIMP candidate rate of 
1 cpd/kg are assumed. The influence of a 10® cpd background (triangles) and of an 
expected rate of 0.1 cpd/kg (crosses) are also shown (from Ramachers et al. [88]). 

A more mundane, and a priori much more dangerous, spurious modulation
results from the variation of the trigger rate close to threshold, an effect
which must be carefully monitored by looking at event rate modulations
below or close to the effective threshold. In this respect, the DAMA group
has tested the stability of its data taking conditions by using as a reference
population the high energy event population (E > 90 keV). But, although
these events, well above threshold, may actually present an absence of
(annual or daily) modulation, the data stability very close to threshold is of
greater concern. In particular, the stability of the selection cut between
the physical events and the photomultiplier noise is essential and should be
tested against the presence of not only annual, but also daily and weekly
modulations.

In effect, it would be extremely surprising if low-energy data at the
threshold level did not present, e.g., a daily modulation, since residual
temperature variations and human activities in the lab do present such
modulations. To ascertain the possible observation of a WIMP signal, it
therefore seems essential to check not only the annual periodicity but the
whole Fourier spectrum against spurious modulations in the data.
The fact that the initial observation of anomalous events in the UKDMC
NaI experiment also presented a summer-winter difference [69] is a further
illustration that spurious modulations may easily affect low-energy data.
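
As an illustration of such a check, the following minimal sketch scans a list
of event time stamps for daily, weekly and annual periodicities with a
Rayleigh power statistic. It is not the procedure of any of the experiments
cited here: the toy data generator, the function names and the choice of the
Rayleigh statistic are assumptions made for the example.

```python
import numpy as np

def rayleigh_power(event_times_days, period_days):
    """Rayleigh power of event time stamps at one trial period.

    For unmodulated data covering many periods, the power is roughly
    exponentially distributed with mean 1, so values far above ~10
    point to a periodicity at that trial period.
    """
    phases = 2.0 * np.pi * np.asarray(event_times_days, dtype=float) / period_days
    c = np.cos(phases).sum()
    s = np.sin(phases).sum()
    return (c * c + s * s) / phases.size

def scan_periods(event_times_days, trial_periods_days):
    """Return {label: Rayleigh power} for each trial period."""
    return {label: rayleigh_power(event_times_days, p)
            for label, p in trial_periods_days.items()}

def toy_events(n_days, n_trials, daily_amplitude, seed=0):
    """Toy event list (assumption): flat rate with a purely daily modulation."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, n_days, n_trials)
    # Accept-reject step: keep events with probability following the rate.
    keep_prob = (1.0 + daily_amplitude * np.cos(2.0 * np.pi * t)) / (1.0 + daily_amplitude)
    return t[rng.uniform(0.0, 1.0, n_trials) < keep_prob]

trial_periods = {"daily": 1.0, "weekly": 7.0, "annual": 365.25}
powers = scan_periods(toy_events(730, 200_000, 0.10), trial_periods)
for label, power in powers.items():
    print(f"{label:7s} Rayleigh power = {power:8.1f}")
```

In this toy case only the daily period should stand out. On real data the same
scan would be applied to the events closest to threshold, where the spurious
modulations discussed above are expected to be largest.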

Another question raised by the DAMA data is related to the cuts and
efficiency corrections applied close to threshold. In particular, it was
noted [87] that two of the nine DAMA detectors presented steep variations
of the event rate close to threshold that appear to be incompatible with
the detector energy resolution. In addition, these two detectors exhibit
event rates in the 2–3 keV interval of 0.5 event/kg/keV/day, a factor 20
below the Saclay experiment [68], which used a subset of the NaI crystal
detectors of the DAMA experiment [64,65]. With this event rate, the
remaining radioactivity must be close to zero once the WIMP contribution
has been subtracted, which seems extremely surprising.

Fig. 10. Modulated amplitudes as a function of the deposited energy obtained
from the data of the DAMA/NaI experiment and reanalyzed by the Sierra Grande
experiment (from Abriola et al. [90]). The solid line corresponds to the
signal expected from a 60 GeV WIMP with cross-section σ(W-nucleon) =
1.0 x 10“® pb. The energy resolution of the NaI detector (which was not taken
into account in the calculation) would further decrease the amplitude of the
curve by a factor 2.

Other authors have also noted [87, 90] that the initial claim of an annual 
modulation by the DAMA group [88] presented several inconsistencies. In 
particular, the initial signal appeared to be significant in only two out of 
the nine crystals, and the excess events attributed to the candidate WIMP 
presented an energy distribution extending well beyond that expected from 
a 60 GeV WIMP (Fig. 10). As shown by the 4 sigma statistical significance 
of the recent DAMA report, the first observation, with ten times smaller
statistics, lacked the statistical precision required to establish its claim and 
is in complete contradiction with the latest CDMS result [84]. It then seems 
quite improbable that additional data could confirm an initially inconsistent 
observation. 

In conclusion, it appears unlikely that a single experiment will con- 
vincingly demonstrate the existence of a WIMP signal through the annual 
modulation technique if it cannot demonstrate by some discrimination pro- 
cedure that a reasonably pure sample of nuclear recoils has been selected. 
In any case, this annual modulation signal should soon be tested by more 
precise experiments such as CDMS, EDELWEISS or Sierra Grande.

2.5 Other discrimination techniques

2.5.1 Liquid xenon detectors 

Xenon represents a target adapted to high mass WIMP detection through
coherent couplings, and also benefits from non-zero spin isotopes. Liquid
xenon has been used by the DAMA experiment [92], and double phase
xenon detectors have been studied by the ICARUS experiment [93]. In the
latter detectors, the temporal structure of the scintillation signal makes
it possible to discriminate nuclear from electron recoils [94]. The
discrimination is based on the difference in ionization density:
recombination leads to a fast time structure for the nuclear recoil
scintillation (< 3 ns recombination time) compared to that of electron
recoils (≈ 40 ns recombination time).
The rejection factors that have recently been obtained are excellent and
comparable to those obtained by the cryogenic detectors. Therefore, a 5
kg liquid Xe detector is presently being built by the UKDMC collaboration,
and a 30 kg detector has been proposed by the ZEPLIN-II collaboration [95].
Still, these detectors have yet to demonstrate in a real scale experiment that
they can cope with environmental radioactivity and systematic effects.

2.5.2 SIMPLE and PICASSO 

A particularly elegant and inexpensive way to discriminate the gamma and
electron radioactive background from nuclear recoils has been proposed
by the PICASSO [96] and SIMPLE [97] experiments. These experiments use
gels loaded with freon liquid droplets in a metastable state at normal
temperatures. Under suitable control of pressure and temperature conditions,
the very local deposition of energy induced by a nuclear recoil is able to
trigger the nucleation of a bubble, which can be detected by acoustical
means using piezoelectric sensors. On the other hand, the external conditions
can be simultaneously chosen such that the energy density along tracks of 
minimum ionizing particles, which represent most of the radioactive back- 
ground, is insufficient to trigger the nucleation. Encouraging preliminary 
results have recently been presented by the SIMPLE experiment [97] which 
may allow reaching sensitivities for spin-dependent interactions competitive 
with that reported by the presently most sensitive experiments. However, 
to reach this goal, these digital experiments will have to get rid of their in- 
ternal alpha contamination, which is also able to trigger bubble formation. 
Unable to measure the deposited energy, these digital experiments should 
also demonstrate that they are sensitive in stable conditions to the very 
low energy recoils induced by fast ambient neutrons or, for that matter, to 
WIMP interactions. Compared to the conventional techniques, these inex- 
pensive detectors may provide a convenient calibration tool of the neutron 
flux in deep underground sites, which will represent one of the ultimate 
backgrounds of direct detection experiments. 




2.6 Detecting the recoil direction 

Assuming a non-rotating halo relative to the galaxy, WIMPs will exhibit 
a strong anisotropy with respect to the laboratory frame, rotating with a 
daily period and reflected in their recoils within a detector [32]. Therefore, 
several groups [98-109] have addressed the challenging experimental ques- 
tion of determining the recoil direction under a WIMP interaction in order 
to measure this strong directional signature. Compared to the annual mod- 
ulation signature, which usually presents at best a modulated amplitude of 
7%, the recoil direction, at least for nuclei with a mass comparable to that 
of the incident WIMP, might present much larger anisotropies, of the order 
of 50%. In addition, the directional signature will increase with the recoil
energy and is therefore much less sensitive to threshold effects than the
annual modulation technique. The difficulty lies of course in reconstructing
the recoil direction in an interaction of a few keV to a few tens
of keV at most.
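
To give a feeling for what the difference between a 7% and a 50% signature
means in practice, the sketch below estimates the number of signal events
needed to establish a modulation of fractional amplitude A. It assumes a
pure signal, uniform coverage of the modulation phase, and a sinusoidal
modulation whose fitted amplitude has a statistical error of sqrt(2/N) for
N events. These simplifications, the 3 sigma criterion, and the treatment of
the directional anisotropy as an equivalent sinusoidal amplitude are
assumptions of the example, not statements from the analyses cited above.

```python
import math

def events_needed(amplitude, n_sigma):
    """Events N for which a fractional modulation `amplitude` lies
    n_sigma away from zero, using sigma_A = sqrt(2 / N)."""
    return math.ceil(2.0 * (n_sigma / amplitude) ** 2)

for label, amplitude in (("annual modulation, A = 7%", 0.07),
                         ("recoil direction,  A = 50%", 0.50)):
    print(f"{label}: about {events_needed(amplitude, 3.0)} events for 3 sigma")
```

Under these assumptions the required number of events scales as 1/A², so the
directional signature would need roughly fifty times fewer events than the
annual modulation for the same significance.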

For such small deposited energies, the nucleus typically recoils by
submicron distances in usual solid state detectors. A first technique
addressing this problem has come from the use of micas, where particles with
a high density of ionization create local deterioration of the crystal, which
can later be revealed by etching and measured with an atomic force microscope
(AFM). Micas therefore present a natural discrimination against the usual
radioactive background and record only the most ionizing particles [98-102].
However, the use of micas has been limited until now to rather small samples 
since the association of tracks on cleaved planes is necessary to select the 
track energy range. Still, the interactions are recorded for typically a billion 
years for very old samples, and the mica technique can lead to very large 
exposures even for small samples. This technique has been developed by the 
Berkeley group and has led to sensitivities at high WIMP mass comparable 
to the best reported limits using conventional techniques [98-102]. 

In practice, these experiments will be ultimately limited by the neutron 
background, which will be superimposed on a possible WIMP signal. A 
possible way to increase the sensitivity would be to reconstruct the direc- 
tional anisotropy of the etched tracks and, supposing the initial orientation 
of the mica sample is known, to associate it with the WIMP interactions 
anisotropy [101,102]. In practice, however, this signature will be blurred 
by the averaging produced by the various rotations of the Earth, and by 
the possible anisotropy in the local neutron background. It is therefore not 
clear that significant improvements over existing results can be realized. 

A completely different technique has been proposed in the context of
the HERON project [104] for real-time solar neutrino detection. In the
remarkably homogeneous medium of superfluid helium, a local deposition
of energy with a high density will create an opaque roton cylinder around the
track. Due to Lambert’s law, the resulting roton emission is then anisotropic
and can be correlated to the incident particle direction [105]. However, this
beautiful experiment has only been demonstrated with alpha particles of a
few MeV and is probably inoperative at the much smaller recoil energies
characteristic of WIMP interactions.

Fig. 11. Schematic view of the DRIFT (Directional Recoil Identification From
Tracks) detector using a gas target to measure the anisotropy in WIMP
interactions (from Lehner et al. [109]).

A third, and probably the most promising, technique to identify the
recoil direction in a WIMP interaction is being developed in the context of
the DRIFT (Directional Recoil Identification From Tracks) project [106]. The
principle of this directionality measurement is represented in Figure 11.
The initial idea, proposed by Rich and Spiro [107], was to use a low pres- 
sure TPC immersed in a high magnetic field. The low pressure TPC offers 
a gas target of sufficiently low density (typically a few to a few tens of 
Torr) to enable the track length determination of a recoiling nucleus. The 
electron recoil vs. nuclear recoil discrimination is then achieved by aligning 
precisely the magnetic field along the TPC axis. Under these conditions, 
nuclear recoils, with track length in the centimeter range for recoils of a 
few tens of keV, are discriminated against the much shorter electron recoil 
tracks, which are spiraling with a very small radius at these low energies. 
Initial measurements by Buckland et al. at UCSD [108] have demonstrated 
the feasibility of this technique by using an optical read-out associated with 
a CCD camera that greatly reduced the cost of transverse position mea- 
surement. However, the low density of the target material, requiring TPCs 
of hundreds of cubic meters for a 1 kg experiment, and the requirement 
of a superconducting magnet over a very large volume were obviously two 
strongly limiting factors [109]. 
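
The order of magnitude of the required volume follows from the ideal gas law.
The sketch below is only an illustration: the gas species (argon and xenon),
the temperature of 293 K and the function name are assumptions, since the
text does not specify the target gas.

```python
R_GAS = 8.314462618        # molar gas constant, J / (mol K)
TORR_TO_PA = 133.322368    # pascal per Torr

def gas_volume_m3(target_mass_kg, pressure_torr, molar_mass_g_mol,
                  temperature_k=293.0):
    """Ideal-gas volume occupied by a given target mass at low pressure."""
    moles = target_mass_kg * 1000.0 / molar_mass_g_mol
    pressure_pa = pressure_torr * TORR_TO_PA
    return moles * R_GAS * temperature_k / pressure_pa

# Illustrative gases (assumptions): argon (39.948 g/mol), xenon (131.29 g/mol).
for gas, molar_mass in (("Ar", 39.948), ("Xe", 131.29)):
    for pressure in (2.0, 10.0, 40.0):
        volume = gas_volume_m3(1.0, pressure, molar_mass)
        print(f"1 kg of {gas} at {pressure:4.0f} Torr: {volume:7.1f} m^3")
```

For a light gas at a few Torr the volume per kilogram reaches the range of
hundreds of cubic meters, and it scales inversely with both the pressure and
the molar mass.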

An important development, making it possible to dispense with the magnetic
field, was demonstrated in the DRIFT project [106] by using negative ion
drift, where the lateral diffusion is strongly reduced in the ratio of the
ion mass to that of the electron. However, even using the projected
performance objectives of the DRIFT detector, the directional modulation
amplitude appears so far to be ≈ 20%. This should then be compared to the
typical annual modulation amplitude, both these numbers supposing a pure
WIMP signal, without neutron background or degraded alpha contamination
originating from the wires.

In conclusion, although several developments have been realized to mea- 
sure the recoil direction in the relevant energy range of the WIMP signal, 
which would represent a very important identification signature, it remains 
to be proved that a sufficient target mass can be used in a realistic and 
effective detection scheme. 

2.7 Low-energy WIMPs trapped in the Solar System

Using as a reference the population of WIMPs trapped in the Sun or the
Earth [110], Damour and Krauss [111] have noted the possible existence of
an initially unexpected low-energy WIMP population trapped in the inner
regions of the Solar System. These trapped particles correspond to WIMPs
that have undergone an elastic scattering in the Sun that captured them in a
bound orbit in the Solar System. If they stay for a long time on these orbits,
these WIMPs will eventually be captured at the center of the Sun by further
nuclear scattering. But the perturbative action of the planets of the Solar
System and, most notably, of Jupiter, is able to drive a significant number
of these WIMPs into safer orbits that no longer intersect the Sun.
Compared to the standard WIMP population, whose expected rms
velocity is ≈ 250 km/s, these WIMPs trapped in the Solar System have
typical velocities of the order of 30 km/s.

As a consequence, the energy transferred in a collision with a nucleus
is typically a fraction of a keV, instead of a few tens of keV. Detection
of such small energy transfers has already been achieved by the CRESST
experiment, with energy thresholds of ≈ 500 eV for 262 gram sapphire
detectors [48]. In this respect, the performance in terms of energy
sensitivity and resolution reached by the CRESST experiment is almost that
required for an efficient detection of this low-energy WIMP population.
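
These orders of magnitude can be recovered from elastic scattering
kinematics, for which the maximum recoil energy is E_max = 2 μ² v² / m_N,
with μ the WIMP-nucleus reduced mass. The sketch below is illustrative only:
the 60 GeV WIMP mass and the germanium target (A ≈ 72.6) are assumptions
chosen for the example, and the typical recoil energy is lower than this
head-on maximum.

```python
C_KM_S = 299_792.458   # speed of light, km/s
AMU_GEV = 0.931494     # atomic mass unit, GeV/c^2

def max_recoil_energy_kev(wimp_mass_gev, target_a, speed_km_s):
    """Maximum (head-on) nuclear recoil energy in keV for elastic scattering."""
    m_nucleus = target_a * AMU_GEV
    mu = wimp_mass_gev * m_nucleus / (wimp_mass_gev + m_nucleus)  # reduced mass
    beta = speed_km_s / C_KM_S
    return 2.0 * mu ** 2 * beta ** 2 / m_nucleus * 1.0e6  # GeV -> keV

# Assumed example: 60 GeV WIMP on germanium.
halo = max_recoil_energy_kev(60.0, 72.6, 250.0)     # standard halo population
trapped = max_recoil_energy_kev(60.0, 72.6, 30.0)   # Solar System population
print(f"halo WIMP, 250 km/s:   E_max = {halo:.1f} keV")
print(f"trapped WIMP, 30 km/s: E_max = {trapped:.2f} keV")
```

Since the recoil energy scales as v², going from 250 km/s to 30 km/s reduces
it by a factor of about 70, from tens of keV to a few hundred eV.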

Although impressive, this performance is not sufficient by itself,
however, to readily identify a WIMP signal. In effect, the radioactive
background at these very low energies is usually > 1 event/kg/keV/day.
Without background discrimination allowing the identification of nuclear
recoils, it then seems impossible to ascertain the presence of a WIMP
component in the huge electronic background, with an event rate diverging
close to threshold. On the other hand, the discrimination techniques that
have been developed until now are only reasonably efficient above a
recoil energy of typically 10 keV at best [57,84]. Therefore, to efficiently
detect this low-energy population, the discrimination energy threshold must
be decreased by almost two orders of magnitude. Now, for a nuclear recoil
of 300 eV, only ≈ 25 electron-hole pairs are created, which makes
this detection nearly impossible for conventional FET-based electronics.
This identification may become possible, however, by using Single Electron
Transistor (or SET) electronics developed over the last few years by several
groups [112-114]. These SETs, which are the charge analog of the quantum
current measuring SQUID device, may allow the measurement of such very
small charge signals coupled to small capacitance detectors (typically 1 pF).
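
The size of the signal can be illustrated with a short calculation. The
sketch assumes a germanium-like detector with about 3 eV per electron-hole
pair and an ionization quenching factor of about 0.25 for nuclear recoils.
Both values are assumptions of the example, chosen because they reproduce the
≈ 25 pairs quoted above; the function names are hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, coulomb

def pairs_created(recoil_ev, quenching=0.25, ev_per_pair=3.0):
    """Mean number of electron-hole pairs for a nuclear recoil.

    quenching and ev_per_pair are assumed, germanium-like values.
    """
    return recoil_ev * quenching / ev_per_pair

def voltage_step_uv(n_pairs, capacitance_pf=1.0):
    """Voltage step in microvolts when n_pairs charges are collected
    on a detector of the given capacitance (in pF)."""
    charge = n_pairs * E_CHARGE
    return charge / (capacitance_pf * 1.0e-12) * 1.0e6

n_pairs = pairs_created(300.0)
print(f"300 eV nuclear recoil: about {n_pairs:.0f} electron-hole pairs")
print(f"voltage step on 1 pF: about {voltage_step_uv(n_pairs):.1f} microvolts")
```

Under these assumptions the full signal is a step of a few microvolts on a
1 pF detector, and the step scales inversely with the capacitance, which is
why small capacitance detectors are needed.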

2.8 WIMPs with dominant axial interactions. Direct vs. indirect detection 

In a small fraction of the SUSY Monte-Carlo universes drawn at random
by particle theorists (among which at most one is actually realized),
WIMPs may present axial couplings dominating the scalar
interaction term. In this case, direct detection experiments do not
benefit from the approximate A² coherent enhancement of the interaction over
large A nuclei. In this unfavorable case, direct detection experiments have
the greatest difficulties remaining competitive with indirect detection
experiments. An optimized direct detection experiment would involve a
detector mass of a few kg, measuring the incident WIMP anisotropy through
that of the recoil nucleus and with a null background. Kamionkowski et al.
[115] have shown that, even under these optimistic assumptions, indirect
detection experiments with effective detection surface in the 0.1–1 km²
range, such as AMANDA-II, ICECUBE or ANTARES [116,117], would present better
sensitivities than direct detection experiments (Fig. 12). It is not even
clear that, for a purely axial interaction, and with the present detection
strategies, direct detection methods can test a significant part of the
SUSY parameter space compatible with accelerator constraints. On the other
hand, although WIMP annihilations may appear required for Majorana
particles, our present failure to understand the real origin of CP
violation, together with the almost total lack of antimatter in our
vicinity, should induce us to realize that both direct and indirect
detection experiments are essential. From this vantage point, the DRIFT
project (see Section 2.6), assuming a target mass of the order of a few
kilograms and nominal performances, would probably offer a sensitivity
comparable to that of the present stages of the AMANDA, Kamiokande and
Baksan experiments [116,118,119], with effective luminosities in the
10“^ km² x year range.

Figure 13 presents some of the most significant limits reported by direct
detection experiments and the region of parameter space allowed for SUSY
models. The direct detection experiment claiming the best sensitivity is,
as for the coherent coupling, the DAMA experiment, with its 100 kg NaI
detector [64,65]. Two other NaI-based experiments, UKDMC and Saclay,
report somewhat lower sensitivities [67, 68] but, on the other hand, take
into account the existence of anomalous events [68-71], which reduce at
present the discrimination capabilities of these experiments by nearly one
order of magnitude. Even if we assume that the DAMA experiment does
not observe this population of anomalous events, its very existence imposes
an additional degree of freedom which probably reduces the sensitivity of
the DAMA result by nearly an order of magnitude, using the UKDMC and
Saclay experiments as references.

Fig. 12. Comparison of direct vs. indirect detection sensitivities for scalar-
and axial-coupled WIMPs. The sensitivity comparison is given as a function of
the WIMP mass (in GeV) and expressed in terms of the ratio between the
detector surface for indirect detection and the detector mass for a direct
detection experiment (from Kamionkowski et al. [115]).

In any case, it can be seen in Figure 13 that direct detection experiments
fall short by nearly two orders of magnitude of beginning to explore the
MSSM allowed phase space compatible with accelerator data for axial couplings.



2.9 Testing (a significant part of) the SUSY models 

At present, the most sensitive direct detection experiments are just barely 
beginning to explore the region of allowed SUSY models for scalar domi- 
nant couplings. On the other hand, exploring a large part of the MSSM 
parameter space will require a typical sensitivity increase by four orders 
of magnitude over the best achieved performances. This objective, corre- 
sponding to less than 1 event per ton of detector and per day, appears 
extremely ambitious. Although commensurate with the event rates in solar 
neutrino detection experiments, WIMP elastic scattering interactions, apart 
from their preferential nuclear interactions, are far from being distinctive 
and involve energies in the few tens of keV range.

Fig. 13. Axial interaction cross-section upper limits normalized to WIMP-proton
cross-section obtained by direct detection experiments. The regions above the
curves are excluded at 90% confidence level. Also represented are the allowed
MSSM models compatible with accelerator data. The DAMA result (dashed
curve) does not take into account the possible presence of anomalous (surface)
events and should probably be reevaluated.



Reaching these sensitivities at these low energies will then require an 
extreme control of the radioactive background and ambitious discrimina- 
tion schemes. In the following, we explore the main experimental strate- 
gies proposed to significantly extend the sensitivity achieved by present 
experiments. 



2.9.1 Neutron background 

Today, the most sensitive experiment in terms of background identification
and control of the systematics, the CDMS experiment, is limited
by the neutron background associated with muon interactions in the rock of
the shallow Stanford site, at a level of ≈ 1 evt/kg/day. Although neutrons
are obviously strongly interacting neutral particles, their interaction
cross-sections are sufficiently small to represent a significant threat to
WIMP direct detection experiments. Therefore, to significantly improve on the
present CDMS sensitivity, all experiments will require the protection of a
deep underground site. Indeed, after a few meters of rock, the dangerous
hadronic component of cosmic rays has interacted and disappeared. But
the muon flux is still largely present at shallow depth, at the level of a few
muons per m²/s. Although muons in the close vicinity of the experiment can
be detected and vetoed by a specific veto detector, deep inelastic scattering
of muons in the surrounding rock creates hadronic showers far from the
shower core, which are difficult to detect and veto. This is presently the
main limitation of the CDMS experiment [84].

Fig. 14. Event categories observed by the EDELWEISS experiment in the Frejus
Underground Laboratory. In addition to the two initially expected populations of
nuclear and electron recoils, two additional populations, corresponding to
electron and nuclear surface recoils, are also observed, which represent
potentially important limitations to the performance of existing direct
detection experiments (from Benoit et al. [74]).

For this reason, experiments must eventually be installed in deep under- 
ground laboratories where the muon flux is reduced by factors « lO"^— 10® 
compared to ground level, corresponding to a few 10^ to a few muons 
per m²/day. At this stage, most of the neutrons originate from fission
and spallation processes in the surrounding rock, at the level of a few
10⁻⁶ neutrons/cm²/s for the best sites. These neutrons, with a flux
typically a factor 100 to 1000 larger than the muon flux, can be strongly
attenuated by a few tens of centimeters of a low-Z shielding, such as paraffin
or polyethylene. Without this protection, this neutron flux component can
already be detected by discrimination experiments like EDELWEISS [74],
operating in an underground site (Fig. 14).

Another neutron component is associated with the neutron production 
by muons crossing the lead shield, acting here as a neutron multiplier. This 
neutron background can be effectively reduced to a negligible level by iden- 
tifying and vetoing the muons crossing the protective setup around the 
detectors.

Fig. 15. Hexagonal configuration of the germanium detectors proposed in the
GENINO experiment. By using medium-sized detectors (≈ 1 kg), the
identification of neutrons can be made by multiple scattering interactions
in adjacent detectors.



Finally, even under the protection of thousands of meters of water
equivalent, high-energy muons with energies in the TeV range may
catastrophically lose a large fraction of their energy in a deep-inelastic
scattering interaction on a nucleus. Typically 10% of the long range hadronic
component is carried away by neutrons, at very low shower densities, making this
fast neutron background extremely difficult to detect. A large passive shield 
of light material, or preferably a large active scintillator shield, as consid- 
ered, respectively, for the GENIUS and the Borexino experiments [40, 120] 
can strongly reduce this small but dangerous background. 

The residual neutron background can be efficiently monitored by study- 
ing the multiple scatter interactions in an array of densely packed detectors. 
For example, the configuration studied for the GENINO and EDELWEISS-II
experiments involves a series of detector planes with hexagonal paving of
Ge detectors (Fig. 15). For the (1 + 6) inner detectors in each detector plane,
the rejection factor by identification of multiple scatter neutron events can
reach two orders of magnitude. Similar rejection factors are predicted for
the GENIUS experiment. A different strategy has been used by the CDMS
experiment, with interspersed germanium and silicon detectors to measure
the neutron flux on two different target materials. This identification of the
residual background can be essential to identify the small internal produc- 
tion of neutrons (notably through the U/Th chains), for which a veto is 
obviously inefficient. Using these protection strategies, the neutron
background should probably be kept below the level of 10“"^ evt/kg/keV/day.

2.9.2 Surface events

In addition to the neutron background, low-energy surface events (surface 
tritium contamination, X-ray fluorescence from surrounding materials, low 
energy recoils of heavy nuclei associated with alpha surface contamination) 
will probably constitute one of the main problems that WIMP direct detection
experiments will have to face. Incomplete charge collection for events
occurring in the “dead layer” of charge-phonon detectors has been
observed by the CDMS and EDELWEISS experiments [73,74]. Despite a
more efficient charge collection through the use of an amorphous silicon 
layer on the surface of the detectors [73], this problem still represents the 
major limitation of this otherwise extremely efficient discrimination tech- 
nique. Like WIMP interactions, these low energy events generally give a 
signal in only one detector and the anticoincidence strategy, developed for 
example in the HDMS experiment [39], cannot be used. In conventional 
germanium detectors, a large part of the detector surface is usually passi- 
vated and insensitive to these surface events. But the observation of SUSY 
WIMPs requires such an increase of sensitivity over the best existing
performance that these low-energy surface events are probably the most
dangerous background for these non-discriminating detectors. An indication
that surface events may severely limit the performance of the ambitious
GENIUS experiment is given by the HDMS detector, presently experimenting
with an anti-Compton strategy by using a germanium crystal surrounded by a
second larger well-type germanium detector, and which is observing a
background reduction factor of 4 instead of the factor 20 initially
expected.

Similarly, the anomalous events observed by the UKDMC and Saclay
NaI experiments [68-71] are most likely due to low-energy surface nuclear
recoils [74, 121] and represent one of the main limitations of this type of
discriminating experiment. The only technique that at present appears free
from surface effects is the scintillation-heat discrimination technique
developed by the Munich group [80], but this technique has yet to be tested
in a real-scale low background experiment.



2.9.3 Main detector strategies 

What are the detector techniques which could give rise to realistic
experiments in the one ton range, required by the SUSY model cross-sections?
The DAMA group has proposed increasing its NaI detector mass to one ton,
and is presently preparing its 250 kg stage, with the objective of confirming
the evidence for annual modulation. However, the present systematic
limitations of this experiment make it improbable that significant
improvements in sensitivity can be reached by this technique.

The GENIUS project, on the other hand, has also proposed a one ton
experiment, with the principal aim of studying 0ν double beta decay, using
classical germanium detectors. Based on the experience developed over
several years in the Heidelberg-Moscow and HDMS experiments [37, 39], the
GENIUS project uses a series of germanium detectors immersed in liquid
nitrogen, since both these components can be made extremely pure
radioactively.

The major limitation of the GENIUS proposal lies in its present lack of 
discrimination between nuclear and electron recoils. As noted previously, 
low energy surface events probably limit already the performances of the 
HDMS experiment. In addition, at the levels considered here, the double 
beta decay with emission of two neutrinos, a second-order weak interaction 
process, will represent an important source of background extending down 
to low energies. Another radioactive background of concern, specific to
germanium detectors, is due to the presence of the radioactive ⁶⁸Ge isotope
cosmogenically produced at ground level during production. This radioactive
contaminant cannot be separated by the usual purification techniques,
and germanium crystals will have to be produced in an underground site to
reduce this activation to the level required by GENIUS. Using ⁷⁶Ge enriched
crystals will strongly reduce the production of ⁶⁸Ge, but in the absence of
event discrimination, a precise knowledge of the various background sources 
that need to be subtracted will be required in order to extract the WIMP 
signal. Based on past analysis of the Heidelberg-Moscow experiment, it 
seems doubtful that a significant sensitivity increase can be reached from 
the background subtraction procedure. 

A completely different strategy is proposed by the cryogenic experi- 
ments. The emphasis here is not so much on the radioactive purity but 
on the quality of the discrimination between electron and nuclear recoils to 
identify a possible WIMP signal. During the next few years, the CDMS-II,
CRESST-II and EDELWEISS-II experiments will each use a mass of detectors
in the 10 kg range. This should allow a comparison between the various
discrimination strategies (simultaneous measurement of fast phonons and 
charge, athermal phonons and light, or thermal phonons and charge) while 
already testing a significant part of the SUSY allowed models. Larger scale 
experiments, if they are required, will then use the best strategy to explore 
the remaining part of the available phase space of SUSY models. 

In this respect, cooling down a few tons of material to a temperature 
of 100 mK has already been achieved by the NAUTILUS cryogenic search 
for gravitational waves [122]. On the other hand, cryogenic experiments 
still have to demonstrate that they can reliably operate very large numbers
of detectors. The MIBETA experiment [50,51] has already successfully
operated twenty 340 g TeO2 crystals, ≈ 7 kg of detectors, and realized
the largest cryogenic experiment for dark matter and double beta decay
search. In a next stage, the CUORE project [123] intends to use a thousand
750 g TeO2 crystal detectors at a temperature of 10 mK, almost reaching
the one ton objective. Although the CUORE project, principally aimed
at the neutrino mass measurement, will hardly be competitive with future 
dark matter searches due to its present lack of discrimination capabilities, 
this experiment shows that, based on present experience, large numbers of 
detectors at very low temperatures can be considered and realized. For 
such large-scale experiments, ease of fabrication and reliability represent 
two essential factors. In this respect, thin film sensors offer an elegant
solution to the problems of reproducibility and channel multiplicity. Several
groups [47,48,76,83] have developed such sensors with sensitivities
comparable to or even exceeding those of conventional sensors such as Neutron
Transmutation Doped sensors (NTDs), and it does not seem unrealistic to
consider a cryogenic experiment of several hundred detectors.

2.10 Conclusions 

The present generation of direct detection experiments is beginning to ex- 
plore the parameter space of realistic SUSY models. In order to explore a 
significant part of the allowed SUSY models with a reasonable possibility of 
identifying a WIMP signal, a strongly discriminating technique separating 
the nuclear recoils from the important background of electron recoils appears 
highly desirable. The only experiment without discrimination claiming per- 
formances commensurate with such sensitivities, the GENIUS experiment, 
requires an extrapolation of its present performances by nearly four orders 
of magnitude in terms of residual background. Therefore, it seems necessary 
to test the validity of this non-discriminating approach by looking at the 
performances of the HDMS experiment, using an anti-Compton strategy. 
Although impressive, the present performances of the HDMS experiment 
are still far from achieving its initial rejection goals. This may be the in- 
dication that the limitations encountered by the CDMS and EDELWEISS 
experiments with surface events will also be the main limitation encoun- 
tered by the GENIUS project and strongly limit its sensitivity at the low 
energies required to observe a WIMP signal. 

More pragmatically, the CDMS-II, EDELWEISS-II and CRESST-II cry- 
ogenic experiments with background discrimination, and the HDMS and 
GENIUS classical germanium experiments, will test these alternative
strategies in the next few years. 

3 Axions 

Axions represent a completely different solution to the Dark Matter
problem. Their theoretical justification lies in the so-called “strong CP
problem” [124]. In the Standard Model, CP violation may appear in electroweak
interactions through the existence of a complex phase in the CKM matrix.
On the other hand, in strong interactions, the QCD Lagrangian, which was
initially written as:

$$\mathcal{L} = -\frac{1}{4}\,G^a_{\mu\nu}G^{a\,\mu\nu} + \bar{q}\,(i\gamma^\mu D_\mu - M_q)\,q$$

a priori requires an additional term related to the existence of instanton
solutions in non-Abelian gauge theories:

$$\mathcal{L}_\theta = \theta\,\frac{g^2}{32\pi^2}\,G^a_{\mu\nu}\tilde{G}^{a\,\mu\nu}$$

But since the neutron electric dipole moment is experimentally very small:

$$d_n < 1 \times 10^{-25}\ e\,\mathrm{cm}\ \ [125],$$

the dimensionless $\theta$ parameter must be less than $\approx 10^{-10}$, which seems quite
unnatural and constitutes the strong CP problem. To solve this fine-tuning
conundrum, the idea is then to introduce a dynamical axion field so that
the $\theta$ parameter is driven to a very small value. This dynamical mechanism,
associated with a coherent emission of axions at zero momentum in the very
early universe, can be understood geometrically in the “tilted Mexican hat”
representation of a dynamical symmetry breaking. Accordingly, the $\theta$
parameter can be expressed as the ratio $\theta = a/F_a$, where $F_a$ is called the axion
decay constant.

For this mechanism to work, the axion field must have a kinetic energy 
term and no potential term except for the anomaly term. A Goldstone 
boson with an anomaly term, the so-called Peccei-Quinn mechanism [126], 
then represents a solution to this problem. 

The precise nature of the coupling of axions to ordinary particles is
model dependent, however [127]. The two generic classes of axion models
usually considered are the KSVZ model [128], or heavy quark model, where
the heavy quark acquires its mass through a singlet complex Higgs field and
the axion couples to quarks at tree level, and the DFSZ model [129], where
the Lagrangian includes, in addition, a term corresponding to an
axion-lepton coupling. It should be noted that in superstring theories, both
KSVZ and DFSZ aspects are expected to be present if the Peccei-Quinn
symmetry is spontaneously broken.

Independently of the solution to the strong CP problem, it was later
realized that axions represented a possible dark matter candidate in the
mass range between approximately $10^{-6}$ and $10^{-3}$ eV. Outside this mass
interval, axions are, except for some possible narrow windows, strongly
constrained by astrophysical and cosmological arguments. A severe
astrophysical bound comes from the fact that axions produced at the core of
a star must not provide too efficient a cooling channel. A conventional
cooling mechanism of the core is provided by neutrino escape, and axion
cooling must remain consistent with our knowledge of stellar evolution and
energy production. The most severe constraint in this respect comes from
the observation of the time structure (in the tens of seconds range) of
neutrino emission in supernova SN1987A. A strong axion energy loss would
reduce the duration of the neutrino emission. This constraint results in an
upper bound on the axion-photon coupling, and a corresponding lower bound on
the axion decay constant $F_a$. For very small couplings, axions, although
allowed to escape the supernova core, are produced in insufficient
quantities to present a contradiction with existing data.

Two further bounds come from the recent analysis of helioseismological
data [130], which constrains the core energy loss to axion-photon couplings
$g_{a\gamma} < 1 \times 10^{-9}$ GeV$^{-1}$, and from the globular cluster limit, which represents
an even more stringent upper bound $g_{a\gamma} < 0.6 \times 10^{-10}$ GeV$^{-1}$ on this
coupling [131].

On the other hand, a lower bound on the axion-photon coupling, and
an upper bound on the decay constant $F_a \approx 10^{12}$ GeV, is provided by
the constraint that the axion energy density remains within cosmologically
acceptable limits [132].

The main experimental method for detecting axions, initially proposed 
by Sikivie [133], relies on the $a\gamma\gamma$ coupling which makes it possible to convert
axions into detectable radiation in resonant electromagnetic cavities. The 
emitted power is extremely small for tractable electromagnetic fields and is 
given by the expression: 

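$$P \approx 10^{[\ldots]}\ [\ldots] \left(\frac{B_0}{10\ \mathrm{Tesla}}\right)^{2} \left(\frac{\rho_a}{0.4 \times 10^{-24}\ \mathrm{g/cm^3}}\right) [\ldots]$$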

where $V$ is the volume of the cavity, $B_0$ is the magnetic field strength,
$\rho_a$ is the local axion density, $Q_c$ is the quality factor of the cavity and
$1/Q_a \sim 10^{-6}$ is the width of the axion energy distribution. This width is
directly related to the expected velocity dispersion of galactic axions and
its inverse is roughly of the same order of magnitude as the quality factors
of electromagnetic cavities which have been developed.
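
As a rough numerical illustration of the orders of magnitude involved, the
following Python sketch converts an axion mass into the corresponding cavity
resonance frequency and estimates the fractional width of the axion line
from a velocity dispersion. The function names and the assumed galactic
velocity of about 10^-3 c are illustrative choices, not values taken from
the experiments discussed here.

```python
# Illustrative sketch: resonance frequency and line width for a halo axion.
# Assumptions (not from the text): the function names, and a typical galactic
# velocity v ~ 1e-3 c used for the order-of-magnitude width estimate.

PLANCK_EV_S = 4.135667696e-15  # Planck constant h in eV s


def axion_frequency_hz(mass_ev):
    """Frequency f = m c^2 / h of the photon produced by an axion at rest."""
    return mass_ev / PLANCK_EV_S


def fractional_width(v_over_c=1e-3):
    """Order-of-magnitude energy spread, (1/2) v^2 / c^2, from kinetic energy."""
    return 0.5 * v_over_c ** 2


def axion_quality_factor(v_over_c=1e-3):
    """Effective quality factor Q_a, the inverse of the fractional width."""
    return 1.0 / fractional_width(v_over_c)


if __name__ == "__main__":
    f = axion_frequency_hz(3e-6)  # a 3 micro-eV axion
    print(f"frequency: {f / 1e9:.3f} GHz")
    print(f"Q_a: {axion_quality_factor():.1e}")
    print(f"line width: {f * fractional_width():.0f} Hz")
```

For a 3 μeV axion this gives a frequency somewhat below 1 GHz and a
fractional width of order 10^-6, the same order of magnitude as the value
of 1/Q_a quoted above.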

Several experiments have been developed over the last ten years, which
appear to be able to test, for the first time, part of the available phase
space of the axion models. Improving by two orders of magnitude on the
performances of previous attempts [134-136], the LLNL axion search
experiment [137] has recently reached the sensitivity required to test the
KSVZ model in a small window around 3 μeV, and plans to extend its range of
operation to the [$10^{-6}$, $10^{-5}$] eV mass range in the next few years. Since the
emitted power is extremely small, the main experimental limitations are the
thermal noise of the cavity and the electronics noise. To fight against the
thermal noise, the use of cryogenic systems is compulsory and the resonant
cavity lies in a pumped helium system to lower its temperature to ≈ 1.3 K.
The constraint on the electronics noise is also pushed to its fundamental
limits. Although the present stage of the US axion search is still using
conventional electronics with HEMT (high electron mobility) transistors,
developments realized by the Berkeley group on SQUID-based amplifiers [138]
will be required to extend its sensitivity down to that required to test the
DFSZ model, which requires an additional factor of ≈ 2.7 in sensitivity.

A second experimental approach, developed by the Kyoto group,
CARRACK (Cosmic Axion Research with Rydberg Atoms in a Cavity at
Kyoto) [139], involves the selective ionization of Rydberg atoms and the
detection of the electrons produced. These atoms become very sensitive
detectors when n ≈ 150 Rydberg states are used, with energy differences
between adjacent states in the microwave range. Additionally, the absorption
rate can be made sufficiently fast, whereas the lifetime of the upper
Rydberg level, in the millisecond range, is sufficiently long to allow the
transport and selective ionization of the Rydberg atoms. Still, the
stability of these highly excited atoms requires the use of a dilution
refrigerator to reach a temperature of ≈ 10 mK in the CARRACK-II phase of
the experiment. The CARRACK experiment intends to exceed the sensitivity
required to test the KSVZ and DFSZ models over the cosmologically relevant
[2-30] μeV axion mass interval. The present and projected sensitivities of
the two main axion experiments, together with some of the most significant
previous results, are represented in Figure 16. It should be noted, however,
that even at the level of the projected sensitivities, these experiments
will still be unable to cover part of the available phase space of allowed
axion solutions.



Some groups have also tried to detect the inverse conversion of keV ax- 
ions produced inside the Sun into photons through the Primakoff conversion 
mechanism, and using conventional or cryogenic X-ray detection techniques. 
For example, the SOLAX experiment [140, 141] has tested the highly en- 
hanced conversion in oriented single crystal detectors of such solar axions. 
However, this direct detection method is at present significantly less sensi- 
tive than the limit originating from the recent helioseismological data [130], 
and even the most ambitious projects such as CUORE and GENIUS might 
just barely reach the sensitivity imposed by this indirect method. The more 
sensitive globular-cluster constraint is probably beyond reach of this direct 
detection technique. On the other hand, the Tokyo helioscope experiment 
(Fig. 17) has reached sensitivities which extend beyond the solar limit [142] 
and Zioutas et al. [143] have proposed to use a decommissioned LHC test 
magnet which could in principle reach this sensitivity.

Fig. 16. Axion couplings and masses excluded at a 90% confidence level by axion
experiments and astrophysical constraints. Also shown are the KSVZ and DFSZ
model predictions. The vertical excluded regions, from left to right, are the regions
excluded by the microwave cavity experiments, the astrophysical bounds and the
Tokyo helioscope experiment [134-137,142]. All results assume $\rho_{\rm axion} = \rho_{\rm halo}$
(adapted from G. Raffelt, Annu. Rev. Nucl. Part. Sci. 49 (1999) 163).

Fig. 17. Detection principle of the Tokyo helioscope experiment using the
conversion of axions emitted by the Sun into keV X-rays (from [142]).



4 Conclusions and perspectives 

For the first time, axion experiments are able to test a small window of
the cosmologically relevant axion models. Future versions of these
experiments, using different tunable cavities and SQUID readout, or detecting
excited Rydberg atoms, should be able to test a more significant part of
the cosmologically relevant phase space of the KSVZ and the DFSZ axion
models.

Similarly, direct detection experiments are becoming sensitive enough 
to begin exploring the domain of realistic SUSY models. The recent an- 
nouncement of the observation of a first WIMP candidate by the DAMA 
experiment has triggered considerable interest and a similar amount of skep- 
ticism among the physics community. This claim should be tested in the 
near future by several experiments using very different techniques. 

Exploration of a significant fraction of the SUSY models will require, 
on the other hand, an increase in sensitivity by three to four orders of 
magnitude over the best presently achieved sensitivities, together with de- 
tector sizes in the one ton range. Several groups are already proposing 
various strategies in order to reach this ambitious goal before the end of the 
decade. The possibility of using various nuclei as target material (and, in 
particular, the extremely radiopure germanium and silicon), together with 
the possibility to reliably identify and eliminate the radioactive background, 
gives a realistic prospect to observe and constrain the properties of stable 
relic SUSY particles within the next few years. 



Stimulating discussions with B. Cabrera, R. Gaitskell, G. Gerbier, C. Goldback, M.
Loidl, J. Mallet, B. Majorotvits, G. Nollez, J. Rich, A. Spadafora and N. Spooner are
gratefully acknowledged. Needless to say, these people are not responsible for the errors
or omissions contained in this paper. This work has been partially funded by the
EEC-Network program under contract ERBFMRXCT980167.

INFLATION AND CREATION OF MATTER
IN THE UNIVERSE 



A. Linde 



Abstract 

The evolution of the inflationary theory is described, starting from
the Starobinsky model and the old inflation scenario, toward chaotic
inflation and the theory of an eternally expanding self-reproducing
inflationary universe. Special attention is devoted to the recent progress
in the theory of reheating and creation of matter after inflation. We
also discuss the theory of gravitino and moduli production, as well as
some recent developments and problems related to string cosmology
and the brane world scenario.

1 Introduction 

The typical lifetime of a new trend in high energy physics and cosmology
nowadays is about 5 to 10 years. If it survives for a longer time, the
chances are that it will be with us for quite a while. Inflationary theory
is by now 20 years old, and it is still very much alive. It is the only
theory which explains why our universe is so homogeneous, flat, and
isotropic, and why its different parts began their expansion
simultaneously. It provides a mechanism explaining galaxy formation and
solves numerous different problems at the intersection between cosmology
and particle physics. It seems to be in good agreement with observational
data, and it does not have any competitors. Thus we have some reasons for
optimism.

According to the standard textbook description, inflation is a stage of
exponential expansion in a supercooled false vacuum state formed as a
result of high temperature phase transitions in GUTs. However, during the
last 20 years inflationary theory has changed quite substantially. New
versions of inflationary theory typically do not require any assumptions
about initial thermal equilibrium in the early universe, supercooling and
exponential expansion in the false vacuum state. Instead, we are thinking
about chaotic initial conditions, quantum cosmology and the theory of a
self-reproducing universe.

Inflationary theory was proposed as an attempt to resolve problems of
the Big Bang theory. In particular, inflation provides a simple
explanation of the extraordinary homogeneity of the observable part of the
universe. But it can make the universe extremely inhomogeneous on a much
greater scale. Now we believe that instead of being a single, expanding ball
of fire produced in the Big Bang, the universe looks like a huge growing
fractal. It consists of many inflating balls that produce new balls, which
in turn produce more new balls, ad infinitum. Even now we continue learning
new things about inflationary cosmology, especially about the stage of
reheating of the universe after inflation.

In this paper we will briefly describe the history of inflationary cosmol- 
ogy, and then we will give a review of some recent developments. 

2 Brief history of inflation 

The first model of inflationary type was proposed by Starobinsky in 1979 [1].
It was based on investigation of the conformal anomaly in quantum gravity.
This model was rather complicated, it did not aim at solving the
homogeneity, horizon and monopole problems, and it was not easy to
understand the beginning of inflation in this model. However, it did not
suffer from the graceful exit problem, and in this sense it can be
considered the first working model of inflation. The theory of density
perturbations in this model was developed in 1981 by Mukhanov and
Chibisov [2]. This theory does not differ much from the theory of density
perturbations in new inflation, which was proposed later by Hawking
et al. [3,4].

A much simpler model with a very clear physical motivation was pro- 
posed by Guth in 1981 [5]. His model, which is now called “old inflation”, 
was based on the theory of supercooling during the cosmological phase tran- 
sitions [6]. It was so attractive that even now all textbooks on astronomy 
and most of the popular books on cosmology describe inflation as exponen- 
tial expansion of the universe in a supercooled false vacuum state. It is 
seductively easy to explain the nature of inflation in this scenario. False 
vacuum is a metastable state without any fields or particles but with large 
energy density. Imagine a universe filled with such “heavy nothing”. When
the universe expands, empty space remains empty, so its energy density 
does not change. The universe with a constant energy density expands 
exponentially, thus we have inflation in the false vacuum. 

Unfortunately this explanation is somewhat misleading. Expansion in 
the false vacuum in a certain sense is false: de Sitter space with a constant 
vacuum energy density can be considered either expanding, or contracting, 
or static, depending on the choice of a coordinate system [7]. The absence 
of a preferable hypersurface of decay of the false vacuum is the main reason 
why the universe after inflation in this scenario becomes very inhomogeneous
[5]. After many attempts to overcome this problem, it was concluded that
the old inflation scenario cannot be improved [8].

Fortunately, this problem was resolved with the invention of the new
inflationary theory [9]. In this theory, just like in the Starobinsky model,
inflation may begin in the false vacuum. This stage of inflation is not very
useful, but it sets the stage for the next one, which occurs when the
inflaton field $\phi$ driving inflation moves away from the false vacuum and
slowly rolls down to the minimum of its effective potential. The motion
of the field away from the false vacuum is of crucial importance: density
perturbations produced during inflation are inversely proportional to
$\dot\phi$ [2, 3]. Thus the key difference between the new inflationary scenario
and the old one is that the useful part of inflation in the new scenario,
which is responsible for homogeneity of our universe, does not occur in the
false vacuum state.

The new inflation scenario was plagued by its own problems. This scenario
works only if the effective potential of the field $\phi$ has a very flat
plateau near $\phi = 0$, which is somewhat artificial. In most versions of
this scenario the inflaton field originally could not be in thermal
equilibrium with
other matter fields. The theory of cosmological phase transitions, which 
was the basis for old and new inflation, simply did not work in such a sit- 
uation. Moreover, thermal equilibrium requires many particles interacting 
with each other. This means that new inflation could explain why our uni- 
verse was so large only if it was very large and contained many particles 
from the very beginning. Finally, inflation in this theory begins very late, 
and during the preceding epoch the universe could easily collapse or become 
so inhomogeneous that inflation may never happen [7]. Because of all these 
difficulties no realistic versions of the new inflationary universe scenario have 
been proposed so far. 

From a more general perspective, old and new inflation represented a 
substantial but incomplete modification of the Big Bang theory. It was 
still assumed that the universe was in a state of thermal equilibrium from 
the very beginning, that it was relatively homogeneous and large enough 
to survive until the beginning of inflation, and that the stage of inflation 
was just an intermediate stage of the evolution of the universe. In the be- 
ginning of the 80’s these assumptions seemed most natural and practically 
unavoidable. That is why it was so difficult to overcome a certain psycho- 
logical barrier and abandon all of these assumptions. This was done with 
the invention of the chaotic inflation scenario [10]. This scenario resolved 
all problems of old and new inflation. According to this scenario, inflation 
may occur even in the theories with simplest potentials such as $V(\phi) \sim \phi^n$.
Inflation may begin even if there was no thermal equilibrium in the early
universe, and it may start even at the Planckian density, in which case the
problem of initial conditions for inflation can be easily resolved [7].

Chaotic inflation



To explain the basic idea of chaotic inflation, let us consider the simplest
model of a scalar field $\phi$ with a mass $m$ and with the potential energy density
$V(\phi) = \frac{m^2}{2}\phi^2$, see Figure 1. Since this function has a minimum at $\phi = 0$,
one may expect that the scalar field $\phi$ should oscillate near this minimum.
This is indeed the case if the universe does not expand. However, one can
show that in a rapidly expanding universe the scalar field moves down very
slowly, as a ball in a viscous liquid, viscosity being proportional to the speed
of expansion.

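There are two equations which describe evolution of a homogeneous
scalar field in our model, the field equation

$$\ddot\phi + 3H\dot\phi = -V'(\phi) \tag{2.1}$$

and the Einstein equation

$$H^2 + \frac{k}{a^2} = \frac{8\pi}{3M_p^2}\left(\frac{1}{2}\dot\phi^2 + V(\phi)\right). \tag{2.2}$$
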

Here $H = \dot a/a$ is the Hubble parameter in the universe with a scale factor
$a(t)$, $k = -1, 0, 1$ for an open, flat or closed universe respectively, $M_p$ is the
Planck mass. In the case $V = m^2\phi^2/2$, the first equation becomes similar
to the equation of motion for a harmonic oscillator, where instead of $x(t)$
we have $\phi(t)$, with a friction term $3H\dot\phi$:

$$\ddot\phi + 3H\dot\phi = -m^2\phi. \tag{2.3}$$



If the scalar field $\phi$ initially was large, the Hubble parameter $H$ was large
too, according to the second equation. This means that the friction term
in the first equation was very large, and therefore the scalar field was
moving very slowly, as a ball in a viscous liquid. Therefore at this stage the
energy density of the scalar field, unlike the density of ordinary matter,
remained almost constant, and expansion of the universe continued with a
much greater speed than in the old cosmological theory. Due to the rapid
growth of the scale of the universe and a slow motion of the field $\phi$, soon
after the beginning of this regime one has $\ddot\phi \ll 3H\dot\phi$, $H^2 \gg k/a^2$,
$\dot\phi^2 \ll m^2\phi^2$, so the system of equations can be simplified:

$$3\,\frac{\dot a}{a}\,\dot\phi = -m^2\phi, \qquad \frac{\dot a}{a} = \frac{2m\phi}{M_p}\sqrt{\frac{\pi}{3}}. \tag{2.4}$$

The last equation shows that the size of the universe in this regime grows
approximately as $e^{Ht}$, where $H = \frac{2m\phi}{M_p}\sqrt{\frac{\pi}{3}}$.

More exactly, these equations lead to the following solutions for $\phi$ and $a$:

$$\phi(t) = \phi_0 - \frac{m M_p\, t}{2\sqrt{3\pi}}, \tag{2.5}$$

$$a(t) = a_0 \exp\left[\frac{2\pi}{M_p^2}\left(\phi_0^2 - \phi^2(t)\right)\right]. \tag{2.6}$$

This stage of exponentially rapid expansion of the universe is called inflation.
In realistic versions of inflationary theory its duration could be as short
as $10^{-35}$ seconds. When the field $\phi$ becomes sufficiently small, viscosity
becomes small, inflation ends, and the scalar field $\phi$ begins to oscillate near
the minimum of $V(\phi)$. As any rapidly oscillating classical field, it loses its
energy by creating pairs of elementary particles. These particles interact
with each other and come to a state of thermal equilibrium with some
temperature $T$. From this time on, the corresponding part of the universe
can be described by the standard hot universe theory.
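
The slow-roll solutions (2.5) and (2.6) can be compared with a direct
numerical integration of equations (2.1) and (2.2). The following Python
sketch does this for a flat universe (k = 0) in units where M_p = 1. The
inflaton mass m = 0.01 M_p, the initial field value and the step size are
illustrative assumptions chosen to keep the run short, not the realistic
values, and all function names are hypothetical.

```python
import math

M_P = 1.0   # Planck mass; all quantities are in units with M_p = 1
M = 0.01    # inflaton mass m (illustrative assumption, not the realistic value)


def hubble(phi, dphi):
    """H from the Einstein equation (2.2) with k = 0 and V = m^2 phi^2 / 2."""
    rho = 0.5 * dphi ** 2 + 0.5 * M ** 2 * phi ** 2
    return math.sqrt(8.0 * math.pi * rho / 3.0) / M_P


def derivatives(state):
    """Time derivatives of (phi, dphi/dt, ln a) from (2.1) and H = (da/dt)/a."""
    phi, dphi, _ = state
    h = hubble(phi, dphi)
    return (dphi, -3.0 * h * dphi - M ** 2 * phi, h)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return tuple(s + factor * dt * ki for s, ki in zip(state, k))
    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, 0.5))
    k3 = derivatives(shifted(k2, 0.5))
    k4 = derivatives(shifted(k3, 1.0))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def efolds_numeric(phi_start, phi_end, dt=0.05):
    """Integrate until phi drops to phi_end; return ln(a_end / a_start)."""
    dphi_start = -M * M_P / (2.0 * math.sqrt(3.0 * math.pi))  # slope of (2.5)
    state = (phi_start, dphi_start, 0.0)
    while state[0] > phi_end:
        state = rk4_step(state, dt)
    return state[2]


def efolds_slow_roll(phi_start, phi_end):
    """Exponent of (2.6): 2 pi (phi_start^2 - phi_end^2) / M_p^2."""
    return 2.0 * math.pi * (phi_start ** 2 - phi_end ** 2) / M_P ** 2


if __name__ == "__main__":
    n_num = efolds_numeric(3.0, 1.0)
    n_sr = efolds_slow_roll(3.0, 1.0)
    print(f"numerical e-folds: {n_num:.2f}, slow-roll estimate: {n_sr:.2f}")
```

With these parameters the two numbers should agree to within a few percent.
The small difference comes from the kinetic energy of the field, which the
slow-roll approximation neglects.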

The main difference between inflationary theory and the old cosmology
becomes clear when one calculates the size of a typical inflationary domain
at the end of inflation. Investigation of this question shows that even if
the initial size of inflationary universe was as small as the Planck size
$l_P \sim 10^{-33}$ cm, after $10^{-35}$ seconds of inflation the universe acquires a huge
size of $l \sim 10^{10^{12}}$ cm!

To understand where this estimate comes from, let us make the assumption
that initially the scalar field $\phi$ took the largest value which is
compatible with the condition that the energy density of the universe is smaller
than the Planck density, $V(\phi) < M_p^4$. This yields $\phi_0 \sim M_p^2/m$. Then
equation (2.6) implies that during inflation the universe expanded
approximately by $\exp(2\pi\phi_0^2/M_p^2) \sim \exp(2\pi M_p^2/m^2)$. From the theory of density
fluctuations to be discussed later one can deduce that the inflaton mass in
this model should be $m \sim 10^{-6}M_p \sim 10^{13}$ GeV. This leads to the estimate
of the total growth of the size of the universe during inflation $\sim 10^{10^{12}}$.
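
The arithmetic behind this estimate can be checked in a few lines. The
Python sketch below assumes, as in the paragraph above, an initial field
of order M_p^2/m and an inflaton mass of 10^-6 M_p, and converts the
exponent 2π M_p^2/m^2 into a power of ten. The function name is
hypothetical.

```python
import math


def log10_growth(m_over_mp=1e-6):
    """log10 of the factor exp(2 pi phi_0^2 / M_p^2) with phi_0 = M_p^2 / m."""
    phi0_over_mp = 1.0 / m_over_mp           # phi_0 ~ M_p^2 / m, in units of M_p
    efolds = 2.0 * math.pi * phi0_over_mp ** 2
    return efolds / math.log(10.0)           # natural exponent converted to base 10


if __name__ == "__main__":
    exponent = log10_growth()
    print(f"growth factor ~ 10^({exponent:.2e})")
    print(f"i.e. roughly 10^(10^{math.log10(exponent):.1f})")
```

The exponent comes out at a few times 10^12, so the growth factor is a
double exponential, which is why the result is quoted only to the order of
magnitude of its exponent.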

This number is model-dependent, but in all realistic models the size of
the universe after inflation appears to be many orders of magnitude greater
than the size of the part of the universe which we can see now, $l \sim 10^{28}$ cm.
This immediately solves most of the problems of the old cosmological theory.

Our universe is almost exactly homogeneous on large scales because all
inhomogeneities were stretched by a factor of $10^{10^{12}}$. The density of
primordial monopoles and other undesirable “defects” becomes exponentially
diluted by inflation. The universe becomes enormously large. Even if it was
a closed universe of a size $\sim 10^{-33}$ cm, after inflation the distance between
its “South” and “North” poles becomes many orders of magnitude greater
than $10^{28}$ cm. We see only a tiny part of the huge cosmic balloon. That is
why nobody has ever seen how parallel lines cross. That is why the universe
looks so flat.

If one considers a universe which initially consisted of many domains
with chaotically distributed scalar field $\phi$ (or if one considers different
universes with different values of the field), then domains in which the scalar
field was too small never inflated. The main contribution to the total volume
of the universe will be given by those domains which originally contained
large scalar field $\phi$. Inflation of such domains creates huge homogeneous
islands out of initial chaos. Each homogeneous domain in this scenario is
much greater than the size of the observable part of the universe.

Fig. 1. Motion of the scalar field in the theory with $V(\phi) = \frac{m^2}{2}\phi^2$. Several
different regimes are possible, depending on the value of the field $\phi$. If the potential
energy density of the field is greater than the Planck density $M_p^4 \sim 10^{94}$ g/cm$^3$,
quantum fluctuations of space-time are so strong that one cannot describe it in
usual terms. Such a state is called space-time foam. At a somewhat smaller
energy density (region A: $mM_p^3 < V(\phi) < M_p^4$) quantum fluctuations of
space-time are small, but quantum fluctuations of the scalar field $\phi$ may be large. Jumps
of the scalar field due to quantum fluctuations lead to a process of eternal
self-reproduction of inflationary universe which we are going to discuss later. At even
smaller values of $V(\phi)$ (region B: $m^2M_p^2 < V(\phi) < mM_p^3$) fluctuations of the field
$\phi$ are small; it slowly moves down as a ball in a viscous liquid. Inflation occurs
both in the region A and region B. Finally, near the minimum of $V(\phi)$ (region C)
the scalar field rapidly oscillates, creates pairs of elementary particles, and the
universe becomes hot.
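
The regimes described in the caption of Figure 1 can be summarized as a
simple classification by potential energy density. The Python sketch below
assumes the quadratic potential of this section and boundaries at M_p^4,
m M_p^3 and m^2 M_p^2 in units where M_p = 1. These thresholds are
order-of-magnitude estimates only, and the function name and labels are
hypothetical.

```python
def regime(phi, m, m_p=1.0):
    """Classify a field value by the energy density V = m^2 phi^2 / 2.

    The boundaries are order-of-magnitude estimates, assumed here to be
    M_p^4, m M_p^3 and m^2 M_p^2; they are not sharp transitions.
    """
    v = 0.5 * m ** 2 * phi ** 2
    if v > m_p ** 4:
        return "space-time foam"          # above the Planck density
    if v > m * m_p ** 3:
        return "A"                        # inflation with large field fluctuations
    if v > m ** 2 * m_p ** 2:
        return "B"                        # slow roll with small fluctuations
    return "C"                            # oscillations and particle creation


if __name__ == "__main__":
    for phi in (1e7, 1e5, 1e2, 1e-1):     # field values in units of M_p
        print(f"phi = {phi:g} M_p: region {regime(phi, m=1e-6)}")
```

For m = 10^-6 M_p the four sample field values fall, in order, into the
space-time foam regime and regions A, B and C.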

The first models of chaotic inflation were based on the theories with
polynomial potentials, such as $V(\phi) = \pm\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4$. But the main idea of
this scenario is quite generic. One should consider any particular potential
$V(\phi)$, polynomial or not, with or without spontaneous symmetry breaking,
and study all possible initial conditions without assuming that the universe
was in a state of thermal equilibrium, and that the field $\phi$ was in the
minimum of its effective potential from the very beginning [10]. This scenario
strongly deviated from the standard lore of the hot Big Bang theory and
was psychologically difficult to accept. Therefore during the first few years 
after invention of chaotic inflation many authors claimed that the idea of 
chaotic initial conditions is unnatural, and made attempts to realize the 
new inflation scenario based on the theory of high-temperature phase tran- 
sitions, despite numerous problems associated with it. Gradually, however, 
it became clear that the idea of chaotic initial conditions is most general, 
and it is much easier to construct a consistent cosmological theory with- 
out making unnecessary assumptions about thermal equilibrium and high 
temperature phase transitions in the early universe. 

Many other versions of inflationary cosmology have been proposed since 
1983. Most of them are based not on the theory of high-temperature phase 
transitions, as in old and new inflation, but on the idea of chaotic initial 
conditions, which is the defining feature of the chaotic inflation scenario.

3 Quantum fluctuations in the inflationary universe 

The vacuum structure in the exponentially expanding universe is much more
complicated than in ordinary Minkowski space. The wavelengths of all
vacuum fluctuations of the scalar field $\phi$ grow exponentially during inflation.
When the wavelength of any particular fluctuation becomes greater than
$H^{-1}$, this fluctuation stops oscillating, and its amplitude freezes at some
nonzero value $\delta\phi(x)$ because of the large friction term $3H\dot\phi$ in the equation of
motion of the field $\phi$. The amplitude of this fluctuation then remains almost
unchanged for a very long time, whereas its wavelength grows exponentially.
Therefore, the appearance of such a frozen fluctuation is equivalent to the
appearance of a classical field $\delta\phi(x)$ that does not vanish after averaging
over macroscopic intervals of space and time.

Because the vacuum contains fluctuations of all wavelengths, inflation
leads to the continuous creation of new perturbations of the classical field
with wavelengths greater than $H^{-1}$, i.e. with momentum $k$ smaller than $H$.
One can easily understand on dimensional grounds that the average
amplitude of perturbations with momentum $k \sim H$ is $O(H)$. A more accurate
investigation shows that the average amplitude of perturbations generated
during a time interval $H^{-1}$ (in which the universe expands by a factor of $e$)
is given by [7]

$$|\delta\phi(x)| \approx \frac{H}{2\pi}. \tag{3.1}$$

Some of the most important features of inflationary cosmology can be un- 
derstood only with an account taken of these quantum fluctuations. That is 
why in this section we will discuss this issue. We will begin this discussion 
on a rather formal level, and then we will suggest a simple interpretation of 
our results. 







First of all, we will describe the inflationary universe with the help of the
metric of a flat de Sitter space,

$$ds^2 = dt^2 - e^{2Ht}\,d\mathbf{x}^2. \tag{3.2}$$

We will assume that the Hubble constant $H$ practically does not change
during the process, and for simplicity we will begin with investigation of a
massless field $\phi$.

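One can quantize the massless scalar field $\phi$ in de Sitter space in the
coordinates (3.2) in much the same way as in Minkowski space [11]. The scalar
field operator $\phi(\mathbf{x}, t)$ can be represented in the form

$$\phi(\mathbf{x}, t) = (2\pi)^{-3/2}\int d^3p\,\left[a_p^+\,\psi_p(t)\,e^{i\mathbf{p}\mathbf{x}} + a_p^-\,\psi_p^*(t)\,e^{-i\mathbf{p}\mathbf{x}}\right], \tag{3.3}$$

where $\psi_p(t)$ satisfies the equation

$$\ddot\psi_p(t) + 3H\dot\psi_p(t) + p^2 e^{-2Ht}\,\psi_p(t) = 0. \tag{3.4}$$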



The term $3H\dot\psi_p(t)$ originates from the term $3H\dot\phi$ in equation (2.1); the last
term appears because of the gradient term in the Klein-Gordon equation
for the field $\phi$. Note that $p$ is a comoving momentum, which, just like the
coordinates $\mathbf{x}$, does not change when the universe expands.



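In Minkowski space, $\psi_p(t) = \frac{1}{\sqrt{2p}}\, e^{-ipt}$, where $p = \sqrt{\mathbf{p}^2}$. In de Sitter space
(3.2), the general solution of (3.4) takes the form

$$\psi_p(t) = \frac{\sqrt{\pi}}{2}\, H\, \eta^{3/2} \left[ C_1(p)\, H^{(1)}_{3/2}(p\eta) + C_2(p)\, H^{(2)}_{3/2}(p\eta) \right], \tag{3.5}$$

where $\eta = -H^{-1} e^{-Ht}$ is the conformal time, and the $H^{(i)}_{3/2}$ are Hankel
functions:

$$H^{(2)}_{3/2}(x) = \left[ H^{(1)}_{3/2}(x) \right]^* = -\sqrt{\frac{2}{\pi x}}\; e^{-ix} \left( 1 + \frac{1}{ix} \right). \tag{3.6}$$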

Quantization in de Sitter space and Minkowski space should be identical
in the high-frequency limit, i.e., $C_1(p) \to 0$, $C_2(p) \to -1$ as $p \to \infty$. In
particular, this condition is satisfied¹ for $C_1 \equiv 0$, $C_2 \equiv -1$. In that case,

$$\psi_p(t) = \frac{iH}{p\sqrt{2p}} \left( 1 + \frac{p}{iH}\, e^{-Ht} \right) \exp\left( \frac{ip}{H}\, e^{-Ht} \right). \tag{3.7}$$

¹It is important that if the inflationary stage is long enough, all physical results are
independent of the specific choice of functions $C_1(p)$ and $C_2(p)$ if $C_1(p) \to 0$, $C_2(p) \to -1$
as $p \to \infty$.

Notice that at sufficiently large $t$ (when $p\, e^{-Ht} < H$), $\psi_p(t)$ ceases to oscillate
and becomes equal to $\frac{iH}{p\sqrt{2p}}$.
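The behaviour of the mode function can be checked numerically. The sketch below is an illustration only: it assumes units with $H = 1$, takes the solution with $C_1 = 0$, $C_2 = -1$ in the explicit form $\psi_p(t) = \frac{iH}{p\sqrt{2p}}\left(1 + \frac{p}{iH}e^{-Ht}\right)\exp\left(\frac{ip}{H}e^{-Ht}\right)$, and uses hypothetical function names and a finite-difference step chosen for convenience. It evaluates the left-hand side of (3.4) and compares the late-time value of $\psi_p$ with the frozen amplitude $iH/(p\sqrt{2p})$.

```python
import cmath
import math


def psi(p, t, H=1.0):
    """Mode function for C1 = 0, C2 = -1 (assumed explicit form, see lead-in)."""
    u = (p / H) * math.exp(-H * t)  # physical momentum k = p e^{-Ht} in units of H
    return 1j * H / (p * math.sqrt(2 * p)) * (1 + u / 1j) * cmath.exp(1j * u)


def residual(p, t, H=1.0, h=1e-4):
    """Left-hand side of (3.4), with time derivatives from central differences."""
    first = (psi(p, t + h, H) - psi(p, t - h, H)) / (2 * h)
    second = (psi(p, t + h, H) - 2 * psi(p, t, H) + psi(p, t - h, H)) / h**2
    return second + 3 * H * first + p**2 * math.exp(-2 * H * t) * psi(p, t, H)


def frozen_amplitude(p, H=1.0):
    """Value that psi_p approaches once p e^{-Ht} << H."""
    return 1j * H / (p * math.sqrt(2 * p))


if __name__ == "__main__":
    p = 5.0
    for t in (0.0, 1.0, 3.0, 20.0):
        print(t, abs(residual(p, t)), abs(psi(p, t) / frozen_amplitude(p)))
```

The residual stays at the level of the finite-difference error at all times, while the ratio $|\psi_p| / |iH/(p\sqrt{2p})|$ tends to 1 once the physical momentum drops below $H$.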

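The quantity $\langle\phi^2\rangle$ may be simply expressed in terms of $\psi_p$:

$$\langle\phi^2\rangle = \frac{1}{(2\pi)^3} \int |\psi_p|^2\, d^3p = \frac{1}{(2\pi)^3} \int \left( \frac{e^{-2Ht}}{2p} + \frac{H^2}{2p^3} \right) d^3p. \tag{3.8}$$

The physical meaning of this result becomes clear when one transforms from
the conformal momentum $p$, which is time-independent, to the conventional
physical momentum $k = p\, e^{-Ht}$, which decreases as the universe expands:

$$\langle\phi^2\rangle = \frac{1}{(2\pi)^3} \int \frac{d^3k}{k} \left( \frac{1}{2} + \frac{H^2}{2k^2} \right). \tag{3.9}$$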



The first term is the usual contribution of vacuum fluctuations in Minkowski
space with $H = 0$. This contribution can be eliminated by renormalization.
The second term, however, is directly related to inflation. Looked at from
the standpoint of quantization in Minkowski space, this term arises because
of the fact that de Sitter space, apart from the usual quantum fluctuations
that are present when $H = 0$, also contains $\phi$-particles with occupation
numbers

$$n_k = \frac{H^2}{2k^2}. \tag{3.10}$$

It can be seen from (3.9) that the contribution to $\langle\phi^2\rangle$ from long-wave
fluctuations of the $\phi$ field diverges.

However, the value of $\langle\phi^2\rangle$ for a massless field $\phi$ is infinite only in
eternally existing de Sitter space with $H = \mathrm{const}$, and not in the inflationary
universe, which expands (quasi) exponentially starting at some time $t = 0$
(for example, when the density of the universe becomes smaller than the
Planck density). Indeed, the spectrum of vacuum fluctuations (3.9) strongly
differs from the spectrum in Minkowski space when $k \lesssim H$. If the fluctuation
spectrum before inflation has a cutoff at $k \leq k_0 \sim T$ resulting from
high-temperature effects, or at $k \leq k_0 \sim H$ due to a small initial size $\sim H^{-1}$
of an inflationary region, then the spectrum will change at the time of
inflation, due to exponential growth in the wavelength of vacuum fluctuations.
The spectrum (3.9) will gradually be established, but only at momenta
$k \geq k_0\, e^{-Ht}$. There will then be a cutoff in the integral (3.8). Restricting
our attention to contributions made by long-wave fluctuations with $k \leq H$,
which are the only ones that will subsequently be important for us, and
assuming that $k_0 = O(H)$, we obtain

$$\langle\phi^2\rangle \approx \frac{H^2}{2(2\pi)^3} \int_{He^{-Ht}}^{H} \frac{d^3k}{k^3} = \frac{H^2}{4\pi^2} \int_{-Ht}^{0} d\ln\frac{k}{H} = \frac{H^3}{4\pi^2}\, t. \tag{3.11}$$
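The linear growth in (3.11) follows from the fact that every logarithmic interval of $k$ contributes equally to the integral. The following sketch, assuming a cutoff at $k_0 = H$ and a simple midpoint rule in $\ln k$ (the function name and the number of grid points are illustrative choices), integrates $\frac{H^2}{2(2\pi)^3}\, d^3k/k^3$ between $He^{-Ht}$ and $H$ and compares the result with $H^3 t/4\pi^2$.

```python
import math


def phi2_massless(H, t, n=2000):
    """Integrate H^2 / (2 (2 pi)^3) * d^3k / k^3 over H e^{-Ht} < k < H."""
    k_min, k_max = H * math.exp(-H * t), H
    dlnk = (math.log(k_max) - math.log(k_min)) / n
    total = 0.0
    for i in range(n):
        k = k_min * math.exp((i + 0.5) * dlnk)  # midpoint of the i-th bin in ln k
        # d^3k = 4 pi k^2 dk and dk = k d(ln k)
        per_dk = H**2 / (2 * (2 * math.pi) ** 3) * 4 * math.pi * k**2 / k**3
        total += per_dk * k * dlnk
    return total


if __name__ == "__main__":
    H = 1.0
    for t in (1.0, 10.0, 60.0):
        print(t, phi2_massless(H, t), H**3 * t / (4 * math.pi**2))
```

Each e-fold of expansion ($\Delta t = H^{-1}$) adds $(H/2\pi)^2$ to $\langle\phi^2\rangle$, which is the square of the amplitude (3.1).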







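A similar result is obtained for a massive scalar field $\phi$. In that case,
long-wave fluctuations with $m^2 \ll H^2$ behave as

$$\langle\phi^2\rangle = \frac{3H^4}{8\pi^2 m^2} \left[ 1 - \exp\left( -\frac{2m^2}{3H}\, t \right) \right]. \tag{3.12}$$

When $t \lesssim \frac{3H}{m^2}$, the quantity $\langle\phi^2\rangle$ grows linearly, just as in the case of the
massless field (3.11), and it then tends to its asymptotic value

$$\langle\phi^2\rangle = \frac{3H^4}{8\pi^2 m^2}. \tag{3.13}$$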
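The two limits of the massive-field result can be verified directly. The short sketch below evaluates $\langle\phi^2\rangle = \frac{3H^4}{8\pi^2 m^2}\left[1 - \exp\left(-\frac{2m^2}{3H}t\right)\right]$ and compares it with the massless growth law $H^3 t/4\pi^2$ at early times and with the asymptotic value $3H^4/8\pi^2 m^2$ at late times. The parameter values ($H = 1$, $m = 0.1$) and function names are arbitrary illustrative choices.

```python
import math


def phi2_massive(m, H, t):
    """<phi^2> for a light massive field, m^2 << H^2."""
    # -expm1(-x) = 1 - exp(-x), accurate also for very small x
    return 3 * H**4 / (8 * math.pi**2 * m**2) * -math.expm1(-2 * m**2 * t / (3 * H))


def phi2_massless(H, t):
    """Linear growth law of the massless case."""
    return H**3 * t / (4 * math.pi**2)


def phi2_asymptotic(m, H):
    """Late-time limit of phi2_massive."""
    return 3 * H**4 / (8 * math.pi**2 * m**2)


if __name__ == "__main__":
    m, H = 0.1, 1.0
    for t in (1.0, 30.0, 3 * H / m**2, 1e4):
        print(t, phi2_massive(m, H, t), phi2_massless(H, t), phi2_asymptotic(m, H))
```

For $t \ll 3H/m^2$ the first two columns agree, and for $t \gg 3H/m^2$ the first column saturates at the third.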



Let us now try to provide an intuitive physical interpretation of these results.
First, note that the main contribution to $\langle\phi^2\rangle$ (3.11) comes from integrating
over exponentially small $k$ (with $k \sim H \exp(-Ht)$). The corresponding
occupation numbers $n_k$ (3.10) are then exponentially large. One can show
that for large $l = |\mathbf{x} - \mathbf{y}|\, e^{Ht}$, the correlation function $\langle\phi(x)\, \phi(y)\rangle$ for the
massless field $\phi$ is

$$\langle\phi(\mathbf{x},t)\, \phi(\mathbf{y},t)\rangle \approx \langle\phi^2(\mathbf{x},t)\rangle \left( 1 - \frac{1}{Ht} \ln Hl \right). \tag{3.14}$$

This means that the magnitudes of the fields $\phi(x)$ and $\phi(y)$ will be highly
correlated out to exponentially large separations $l \sim H^{-1} \exp(Ht)$, and the
corresponding occupation numbers will be exponentially large. By all these
criteria, long-wave quantum fluctuations of the field $\phi$ with $k \ll H$ behave
like a weakly inhomogeneous (quasi)classical field $\phi$ generated during the
inflationary stage.

Analogous results also hold for a massive field with $m^2 \ll H^2$. There,
the principal contribution to $\langle\phi^2\rangle$ comes from modes with exponentially
small momenta $k \sim H \exp(-3H^2/2m^2)$, and the correlation length is of
order $H^{-1} \exp(3H^2/2m^2)$.

Later on we will develop a stochastic formalism which will allow us to 
describe various properties of the motion of the scalar field. 



4 Quantum fluctuations and density perturbations 

Fluctuations of the field $\phi$ lead to adiabatic density perturbations
$\delta\rho \sim V'(\phi)\, \delta\phi$, which grow after inflation. The theory of inflationary
density perturbations is rather complicated, but one can make an estimate of
their post-inflationary magnitude in the following intuitively simple way:
fluctuations of the scalar field lead to a local delay of the end of inflation
by the time $\delta t \sim \delta\phi/\dot\phi$. The density of the universe after inflation decreases as
$t^{-2}$, so the local time delay $\delta t$ leads to the density contrast $|\delta\rho/\rho| \sim |2\delta t/t|$.
If one takes into account that $\delta\phi \sim H/2\pi$ and that at the end of inflation
$t^{-1} \sim H$, one obtains an estimate

$$\frac{\delta\rho}{\rho} \sim \frac{H^2}{2\pi\dot\phi}. \tag{4.1}$$



Needless to say, this is a very rough estimate. Fortunately, however, it gives
a very good approximation to the correct result which can be obtained by
much more complicated methods [2-4,7]:

$$\frac{\delta\rho}{\rho} = C\, \frac{H^2}{2\pi\dot\phi}, \tag{4.2}$$

where the parameter $C$ depends on the equation of state of the universe. For
example, $C = 6/5$ for the universe dominated by cold dark matter [4]. Then
equations $3H\dot\phi = -V'$ and $H^2 = 8\pi V/3M_p^2$ imply that, in magnitude,

$$\frac{\delta\rho}{\rho} = \frac{16\sqrt{6\pi}}{5}\, \frac{V^{3/2}}{M_p^3\, V'}. \tag{4.3}$$

Here $\phi$ is the value of the classical field $\phi(t)$ (2.4), at which the fluctuation
we consider has the wavelength $l \sim k^{-1} \sim H^{-1}(\phi)$ and becomes frozen in
amplitude. In the simplest theory of the massive scalar field with
$V(\phi) = \frac{m^2}{2}\phi^2$ one has

$$\frac{\delta\rho}{\rho} = \frac{8\sqrt{3\pi}}{5}\, \frac{m\phi^2}{M_p^3}. \tag{4.4}$$



Taking into account (2.4) and also the expansion of the universe by about
$10^{30}$ times after the end of inflation, one can obtain the following result for
the density perturbations with the wavelength $l$ (cm) at the moment when
these perturbations begin growing and the process of the galaxy formation
starts:

$$\frac{\delta\rho}{\rho} \sim m \ln l\,(\mathrm{cm}). \tag{4.5}$$

The definition of $\delta\rho/\rho$ used in [7] corresponds to COBE data for $\delta\rho/\rho \sim 5 \times 10^{-5}$.
This gives $m \sim 10^{-6}$, in Planck units, which is equivalent to $10^{13}$ GeV.
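The normalization of the inflaton mass can be reproduced with a few lines of arithmetic. The sketch below works in Planck units ($M_p = 1$) and rests on assumptions that are not spelled out in this section: the perturbations on the relevant scales are taken to freeze about 60 e-folds before the end of inflation, the number of e-folds is taken to be $N \approx 2\pi\phi^2$ (the same exponent that appears in the estimate $l \sim \exp(2\pi\phi_0^2)$ of Section 5), and the Planck mass is taken as $1.22 \times 10^{19}$ GeV. It also checks that the general formula $\frac{16\sqrt{6\pi}}{5} V^{3/2}/V'$ reduces to $\frac{8\sqrt{3\pi}}{5} m\phi^2$ for the quadratic potential.

```python
import math

M_P_GEV = 1.22e19  # Planck mass in GeV (standard value, assumed here)


def perturbation_quadratic(m, phi):
    """delta rho / rho for V = m^2 phi^2 / 2 in Planck units."""
    return 8 * math.sqrt(3 * math.pi) / 5 * m * phi**2


def perturbation_general(V, dV, phi):
    """delta rho / rho from the general V^{3/2} / V' formula in Planck units."""
    return 16 * math.sqrt(6 * math.pi) / 5 * V(phi) ** 1.5 / dV(phi)


def inflaton_mass(amplitude, n_efolds):
    """Mass that gives the requested amplitude n_efolds before the end of inflation."""
    phi_squared = n_efolds / (2 * math.pi)  # assumes N = 2 pi phi^2
    return amplitude / (8 * math.sqrt(3 * math.pi) / 5 * phi_squared)


m = inflaton_mass(5e-5, 60)

if __name__ == "__main__":
    print("m in Planck units:", m)
    print("m in GeV:", m * M_P_GEV)
```

With these assumptions the mass comes out close to $10^{-6}$ in Planck units, i.e. of order $10^{13}$ GeV, in agreement with the estimate in the text.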

An important feature of the spectrum of density perturbations is its
flatness: $\delta\rho/\rho$ in our model depends on the scale $l$ only logarithmically.
Flatness of the spectrum of $\delta\rho/\rho$ together with flatness of the universe ($\Omega = 1$)
constitute the two most robust predictions of inflationary cosmology. It is
possible to construct models where $\delta\rho/\rho$ changes in a very peculiar way, and
it is also possible to construct theories where $\Omega \neq 1$, but it is not easy to
do so.

5 Initial conditions for inflation



Now we are going to analyse what kind of initial conditions lead to inflation.
Since the only natural parameter of dimension of length which appears
in quantum cosmology is $M_p^{-1}$, it will help us in this section to measure
everything in Planck units, so we will take $M_p^{-1} = 1$ in this section. Let us
consider first a closed Universe of initial size $l \sim 1$ (in Planck units), which
emerges from the space-time foam, or from singularity, or from “nothing”
in a state with the Planck density $\rho \sim 1$ (i.e. $\rho \sim M_p^4$). Only starting
from this moment, i.e. at $\rho \lesssim 1$, can we describe this domain as a classical
Universe. Thus, at this initial moment the sum of the kinetic energy density,
gradient energy density, and the potential energy density is of the order
unity: $\frac{1}{2}\dot\phi^2 + \frac{1}{2}(\partial_i\phi)^2 + V(\phi) \sim 1$.

We wish to emphasize that there are no a priori constraints on the
initial value of the scalar field in this domain, except for the constraint
$\frac{1}{2}\dot\phi^2 + \frac{1}{2}(\partial_i\phi)^2 + V(\phi) \sim 1$. Let us consider for a moment a theory with
$V(\phi) = \mathrm{const}$. This theory is invariant under the shift $\phi \to \phi + a$. Therefore,
in such a theory all initial values of the homogeneous component of the scalar
field $\phi$ are equally probable. Note that this expectation would be incorrect
if the scalar field should vanish at the boundaries of the original domain.
Then the constraint $(\partial_i\phi)^2 \lesssim 1$ would tell us that the scalar field cannot
be greater than 1 inside a domain of initial size 1. However, if the original
domain is a closed Universe, then it has no boundaries. (We will discuss a
more general case shortly.)

The only constraint on the average amplitude of the field appears if
the effective potential is not constant, but grows and becomes greater than
the Planck density at $\phi > \phi_p$, where $V(\phi_p) = 1$. This constraint implies
that $\phi \lesssim \phi_p$, but it does not give any reason to expect that $\phi \ll \phi_p$. This
suggests that the typical initial value $\phi_0$ of the field $\phi$ in such a theory is
$\phi_0 \sim \phi_p$.

Thus, we expect that typical initial conditions correspond to
$\frac{1}{2}\dot\phi^2 \sim \frac{1}{2}(\partial_i\phi)^2 \sim V(\phi) = O(1)$. Note that if by any chance
$\frac{1}{2}\dot\phi^2 + \frac{1}{2}(\partial_i\phi)^2 \lesssim V(\phi)$
in the domain under consideration, then inflation begins, and within the
Planck time the terms $\frac{1}{2}\dot\phi^2$ and $\frac{1}{2}(\partial_i\phi)^2$ become much smaller than $V(\phi)$,
which ensures continuation of inflation. It seems therefore that chaotic
inflation occurs under rather natural initial conditions, if it can begin at
$V(\phi) \sim 1$ [7].

The assumption that inflation may begin at a very large $\phi = \phi_0$ has
important implications. For example, in the theory $m^2\phi^2/2$ one has
$\phi_0 \sim \phi_p \sim m^{-1}$. Let us consider for definiteness a closed Universe of a typical
initial size $O(1)$. Then, according to (2.6), the total size of the Universe
after inflation becomes equal to

$$l \sim \exp\left( 2\pi\phi_0^2 \right) \sim \exp\left( \frac{2\pi}{m^2} \right). \tag{5.1}$$

For $m \sim 10^{-6}$, which is necessary to produce density perturbations
$\delta\rho/\rho \sim 10^{-5}$, one has $l \sim \exp(2\pi \times 10^{12}) > 10^{10^{12}}$ cm. Thus, according to this
estimate, the smallest possible domain of the initial size $O(M_p^{-1}) \sim 10^{-33}$ cm
after inflation becomes much larger than the size of the observable part of
the Universe $\sim 10^{28}$ cm. This is the reason why our part of the Universe
looks flat, homogeneous and isotropic.
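The size estimate is far too large to evaluate directly in floating point, so it is convenient to work with its base-10 logarithm. The sketch below assumes the growth law $l \sim \exp(2\pi/m^2)$ in Planck units, a Planck length of about $10^{-33}$ cm and an observable Universe of about $10^{28}$ cm; the function and constant names are illustrative.

```python
import math

LOG10_PLANCK_LENGTH_CM = -33  # Planck length ~ 1e-33 cm
LOG10_OBSERVABLE_CM = 28      # size of the observable Universe ~ 1e28 cm


def log10_size_in_planck_lengths(m):
    """log10 of l ~ exp(2 pi / m^2), with m in Planck units."""
    return 2 * math.pi / m**2 / math.log(10)


def log10_size_in_cm(m):
    """Same quantity converted from Planck lengths to centimetres."""
    return log10_size_in_planck_lengths(m) + LOG10_PLANCK_LENGTH_CM


if __name__ == "__main__":
    exponent = log10_size_in_cm(1e-6)
    print("log10(l / cm) is about", exponent)
    print("exceeds observable Universe:", exponent > LOG10_OBSERVABLE_CM)
```

For $m \sim 10^{-6}$ the exponent itself is of order $10^{12}$, so the conversion from Planck lengths to centimetres (a shift of the exponent by 33) is completely negligible.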

Assume that the Universe is not closed but infinite, or at least extremely
large from the very beginning. (This objection does not apply to the closed
Universe scenario discussed above.) In this case one could argue that our
expectation that $\phi_0 \sim \phi_p \gg 1$ is not very natural. Indeed, the conditions
$(\partial_i\phi)^2 \lesssim 1$ and $\phi_0 \sim \phi_p \gg 1$ imply that the field $\phi$ should be of the same
order of magnitude $\phi \sim \phi_p \gg 1$ on a length scale at least as large as $\phi_p$,
which is much larger than the scale of horizon $l \sim 1$ at the Planck time.
But this is highly improbable, since initially (i.e., at the Planck time) there
should be no correlation between values of the field $\phi$ in different regions of
the Universe separated from one another by distances greater than 1. The
existence of such correlation would violate causality.

The answer to this objection is very simple [7]. We have absolutely no
reason to expect that the overall energy density $\rho$ simultaneously becomes
smaller than the Planck energy density in all causally disconnected regions
of an infinite Universe, since that would imply the existence of an acausal
correlation between values of $\rho$ in different domains of Planckian size $l_p \sim 1$.
Thus, each such domain at the Planck time after its creation looks like an
isolated island of classical space-time, which emerges from the space-time
foam independently of other such islands. During inflation, each of these
islands independently acquires a size many orders of magnitude larger than
the size of the observable part of the Universe. A typical initial size of
a domain of classical space-time with $\rho \lesssim 1$ is of the order of the Planck
length. Outside each of these domains the condition $\rho \lesssim 1$ no longer holds,
and there is no correlation between values of the field $\phi$ in different
disconnected regions of classical space-time of size 1. But such correlation is
not necessary at all for the realization of the inflationary Universe scenario:
according to the “no hair” theorem for de Sitter space, a sufficient condition
for the existence of an inflationary region of the Universe is that inflation
takes place inside a region whose size is of order $H^{-1}$. In our case this
condition is satisfied.

We wish to emphasize once again that the confusion discussed above,
involving the correlation between values of the field $\phi$ in different causally
disconnected regions of the Universe, is rooted in the familiar notion of a
very large Universe that is instantaneously created from a singular state
with $\rho = \infty$, and instantaneously passes through a state with the Planck
density $\rho = 1$. The lack of justification for such a notion is the very essence
of the horizon problem. Now, having disposed of the horizon problem with
the aid of the inflationary Universe scenario, we can possibly manage to
familiarize ourselves with a different picture of the Universe. In this picture
the simultaneous creation of the whole Universe is possible only if its initial
size is of the order 1, in which case no long-range correlations appear. Initial
conditions should be formulated at the Planck time and on the Planck scale.
Within each Planck-size island of the classical space-time, the initial spatial
distribution of the scalar field cannot be very irregular due to the constraint
$(\partial_i\phi)^2 \lesssim 1$. But this does not impose any constraints on the average values
of the scalar field $\phi$ in each of such domains. One should examine all possible
values of the field $\phi$ and check whether they could lead to inflation.

The arguments given above suggest that initial conditions for inflation
are quite natural if inflation can begin close to the Planck density. These
arguments do not give any support to the models where inflation is possible
only at densities much smaller than 1. And indeed, an investigation of
this question shows, for example, that a typical closed Universe where
inflation is possible only at $V(\phi) \ll 1$ collapses before inflation begins. Thus,
inflationary models of that type require fine-tuned initial conditions, and
apparently cannot solve the flatness problem.

Does this mean that we should forget all models where inflation may
occur only at $V(\phi) \ll 1$? As we will argue later on, it may be possible
to rescue such models within the context of the theory of eternal inflation,
which we are going to elaborate.

But before doing so, we will briefly mention another approach to the 
problem of initial conditions for inflation based on investigation of the wave 
function of the universe and the idea of quantum creation of inflationary 
universe “from nothing” . 

According to classical cosmology, the universe appeared from a singu- 
larity in a state of infinite density. Of course, when the density was greater 
than the Planck density one could not trust the classical Einstein equa- 
tions, but in many cases there is no demonstrated need to study the cre- 
ation of the universe using the methods of quantum theory. For example, 
in the simplest versions of the chaotic inflation scenario [10], the process of 
inflation, at the classical level, could begin directly in the initial singularity. 
However, in certain models, such as the Starobinsky model [1] or the new 
inflationary universe scenario [9], inflation cannot start in a state of infinite 
density. In such cases one may speculate about the possibility that the 
inflationary universe appears due to quantum tunneling “from nothing” . 

The first idea as to how one can describe the creation of an inflationary
universe “from nothing” was given in 1981 by Zeldovich [12] in application to
the Starobinsky model [1]. His idea was qualitatively correct, but he did not
propose any quantitative description of this process. A very important step
in this direction was made in 1982 by Vilenkin [13]. He suggested that the
probability of quantum creation of the universe is given by $\exp(-2S)$, where
$S = -\frac{3M_p^4}{16V(\phi)}$ is the Euclidean action on de Sitter space with the energy
density $V(\phi)$. Vilenkin suggested interpreting this Euclidean solution as
the tunneling trajectory describing the creation of a universe with the scale
factor $a = H^{-1} = \sqrt{\frac{3M_p^2}{8\pi V}}$ from the state with $a = 0$. This would imply that
the probability of the quantum creation of the universe is given by

$$P \sim \exp\left( \frac{3M_p^4}{8V(\phi)} \right). \tag{5.2}$$

A year later this result was reproduced by a different method by Hartle and
Hawking [14]. They obtained this result by taking a square of the wave
function of the ground state of the universe, $P_{HH} \sim |\Psi_0|^2$. We will call
this probability distribution $P_{HH}$.

There were many problems in both derivations of this result. Vilenkin 
obtained it by considering tunneling of a scale factor with negative kinetic 
energy through the negative potential barrier from the state containing no 
initial state. Meanwhile Hartle and Hawking obtained their result using 
an analogy between the universe and a harmonic oscillator. However, as 
explained in [7], their method of finding the wave function of the ground
state $\Psi_0$ works only for the states with positive energy. This condition,
which was crucial for the derivation of their result, may not seem restrictive. 
Indeed, it is satisfied for all usual physical systems. However, it is well known 
that the energy of the scale factor of the universe is negative. Thus, in my 
opinion, there is no consistent mathematical derivation of equation (5.2). 
Moreover, the interpretation of this equation is also questionable. It was 
derived in [14] as an expression for the ground state of the universe, which 
should be its final state. Then the probability distribution $P_{HH}$ is in fact
quite natural: it says that eventually the field $\phi$ should wind up in the
minimum of $V(\phi)$; the probability to find the field $\phi$ away from the minimum
should be exponentially suppressed. Unfortunately, for some reason that I
do not understand, $P_{HH}$ was interpreted not in this obvious way, but as a
probability of creation of the universe from nothing, i.e. as a probability
for the initial state of the universe. This ambiguity will be important for
us later, when we will re-derive this equation using the stochastic approach
to inflation. In this approach the interpretation of this equation will be
quite unambiguous. We believe that this equation, in those few cases where
it is valid, describes the probability distribution for the final state of the
universe, rather than the probability of initial conditions [7].

Soon after the publication of the papers by Vilenkin and Hartle and
Hawking, the problems mentioned above were revealed, and a different
probability distribution was proposed:

$$P_T \sim \exp\left( -\frac{3M_p^4}{8V(\phi)} \right). \tag{5.3}$$

I obtained this probability distribution in 1983 by two different methods. 
The first method was to change the overall sign of the effective Lagrangian 
of the scale factor, in order to make its kinetic term canonically normalized 
(note that it does not change the Lagrange equations). Tunneling in this 
theory looks like a usual quantum mechanical motion through the potential 
barrier. The second derivation (the one which I decided to publish, because 
Zeldovich initially strongly objected to the first one) was based on a different 
way of performing the Wick rotation, which made it possible to find $\Psi_0$
despite the fact that the energy of the scale factor is negative [15].

Additional insight into this matter was given by Starobinsky, who ob- 
tained the same result by an extended version of the first of my methods, 
and then together with Zeldovich applied this method to the study of the 
creation of universes with non-trivial topology [16]. Rubakov used a similar 
method, but argued that particle production during the tunneling will lead 
to substantial modification of equation (5.3) [17]. Later equation (5.3) was 
also obtained by Vilenkin [18]. 

The two different derivations give the same expression for the probability
of universe formation $P_T$, up to the subexponential factor. We will ignore
this issue in our discussion, and concentrate on the exponents. In fact,
the only thing which we want to know is whether it is more probable to
create the universe with a large $V(\phi)$ but a small total mass, or with a
small $V(\phi)$ and a large total mass. Indeed, the total energy of matter in a
closed de Sitter space, at the moment when it is created having size $H^{-1}$, is
proportional to its volume multiplied by its energy density,
$E \sim H^{-3}V \sim M_p^3/\sqrt{V}$. Therefore the whole de Sitter space will contain matter with a
total energy $O(M_p)$ if it is created at $V \sim M_p^4$. Such an event may easily
occur within a time $\Delta t \sim E^{-1} \sim M_p^{-1}$. Meanwhile creation of the universe
with $V \ll M_p^4$ would require an enormous fluctuation of energy, which does
not seem probable. This is another way to understand the physical meaning
of equation (5.3).
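The contrast between the two proposals can be made concrete by evaluating the exponents and the energy estimate for a universe created near the Planck density and for one created far below it. The sketch below works in Planck units and uses $H^2 = 8\pi V/3M_p^2$; the comparison value $V = 10^{-12}$ is an arbitrary illustrative choice, and the numerical prefactor in the energy, which the order-of-magnitude estimate $E \sim M_p^3/\sqrt{V}$ drops, is kept only for definiteness.

```python
import math


def total_energy(V, M_p=1.0):
    """E ~ H^{-3} V with H^2 = 8 pi V / (3 M_p^2); scales as M_p^3 / sqrt(V)."""
    H = math.sqrt(8 * math.pi * V / 3) / M_p
    return V / H**3


def log_P_tunneling(V, M_p=1.0):
    """ln P_T, the exponent -3 M_p^4 / (8 V)."""
    return -3 * M_p**4 / (8 * V)


def log_P_hartle_hawking(V, M_p=1.0):
    """ln P_HH, the exponent +3 M_p^4 / (8 V)."""
    return 3 * M_p**4 / (8 * V)


if __name__ == "__main__":
    for V in (1.0, 1e-12):
        print(V, total_energy(V), log_P_tunneling(V), log_P_hartle_hawking(V))
```

Lowering $V$ by twelve orders of magnitude raises the required energy by a factor $10^6$ and changes $\ln P_T$ from about $-0.4$ to about $-4 \times 10^{11}$, while $\ln P_{HH}$ grows by the same amount with the opposite sign.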

Thus the debate about the choice of different expressions is actually the
debate between the people who believe that it is easier to create a small
universe [15-18] and those who believe that it is easier to create a huge
universe [14]. In my opinion, the answer to this question is obvious, but
since we are discussing the “miracle of creation”, one can always argue that
a big miracle is easier than a small miracle: as a result, this debate has
continued for 15 years. Perhaps this is not surprising, taking into account
the rather esoteric character of the discussion.

In what follows I will try to look at this subject from an alternative 
point of view, based on the stochastic approach to inflation. Within this 
approach equations (5.2) and (5.3) can be derived in a much more clear 
and rigorous way, but they will have a somewhat different interpretation. 
Moreover, I will argue that in the theory of eternal inflation, which I am 
going to discuss now, the whole issue of initial conditions and the choice 
between Phh and Pt may be irrelevant. 



6 From the Big Bang theory to the theory of eternal inflation 

A significant step in the development of inflationary theory which I would
like to discuss here is the discovery of the process of self-reproduction of
the inflationary universe. This process was known to exist in old inflationary
theory [5] and in the new one [19], but it is especially surprising and
leads to most profound consequences in the context of the chaotic inflation
scenario [20]. It appears that in many models a large scalar field during
inflation produces large quantum fluctuations which may locally increase
the value of the scalar field in some parts of the universe. These regions
expand at a greater rate than their parent domains, and quantum fluctuations
inside them lead to production of new inflationary domains which
expand even faster. This surprising behavior leads to an eternal process of
self-reproduction of the universe.

To understand the mechanism of self-reproduction one should remember
that the processes separated by distances $l$ greater than $H^{-1}$ proceed
independently of one another. This is so because during exponential expansion
the distance between any two objects separated by more than $H^{-1}$ is
growing with a speed exceeding the speed of light. As a result, an observer
in the inflationary universe can see only the processes occurring inside the
horizon of the radius $H^{-1}$.

An important consequence of this general result is that the process of
inflation in any spatial domain of radius $H^{-1}$ occurs independently of any
events outside it. In this sense any inflationary domain of initial radius
exceeding $H^{-1}$ can be considered as a separate mini-universe.

To investigate the behavior of such a mini-universe, with an account
taken of quantum fluctuations, let us consider an inflationary domain of
initial radius $H^{-1}$ containing a sufficiently homogeneous field with initial
value $\phi \gg M_p$. Equation (2.4) implies that during a typical time interval
$\Delta t = H^{-1}$ the field inside this domain will be reduced by
$\Delta\phi = \frac{M_p^2}{4\pi\phi}$.
By comparing this expression with
$|\delta\phi(x)| \approx \frac{H}{2\pi} = \sqrt{\frac{2V(\phi)}{3\pi M_p^2}} \sim \frac{m\phi}{3M_p}$ one
can easily see that if $\phi$ is much less than $\phi^* \sim \frac{M_p}{3}\sqrt{\frac{M_p}{m}}$, then the decrease
of the field $\phi$ due to its classical motion is much greater than the average
amplitude of the quantum fluctuations $\delta\phi$ generated during the same time.
But for $\phi \gg \phi^*$ one has $\delta\phi(x) \gg \Delta\phi$. Because the typical wavelength
of the fluctuations $\delta\phi(x)$ generated during this time is $H^{-1}$, the whole
domain after $\Delta t = H^{-1}$ effectively becomes divided into $e^3 \sim 20$ separate
domains (mini-universes) of radius $H^{-1}$, each containing an almost homogeneous
field $\phi - \Delta\phi + \delta\phi$. In almost a half of these domains the field $\phi$ grows
by $|\delta\phi(x)| - \Delta\phi \approx |\delta\phi(x)| = H/2\pi$, rather than decreases. This means
that the total volume of the universe containing the growing field $\phi$ increases 10
times. During the next time interval $\Delta t = H^{-1}$ the situation repeats. Thus,
after the two time intervals $H^{-1}$ the total volume of the universe containing
the growing scalar field increases 100 times, etc. The universe enters an eternal
process of self-reproduction.
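The competition between classical rolling and quantum jumps can be illustrated numerically. The sketch below assumes the quadratic potential $V = m^2\phi^2/2$, the classical decrease $\Delta\phi = M_p^2/4\pi\phi$ per Hubble time, and the quantum step $H/2\pi = \sqrt{2V/3\pi M_p^2}$; the crossover value is obtained here simply by equating the two steps, so it agrees with the estimate of $\phi^*$ in the text only up to a factor of order one. Function names and the sample mass are illustrative.

```python
import math


def classical_step(phi, M_p=1.0):
    """Decrease of phi during one Hubble time due to slow roll."""
    return M_p**2 / (4 * math.pi * phi)


def quantum_step(phi, m, M_p=1.0):
    """Typical quantum jump H / 2 pi during one Hubble time, V = m^2 phi^2 / 2."""
    V = 0.5 * m**2 * phi**2
    return math.sqrt(2 * V / (3 * math.pi * M_p**2))


def crossover_field(m, M_p=1.0):
    """Field value at which the classical and quantum steps are equal."""
    return math.sqrt(math.sqrt(3 * math.pi) / (4 * math.pi) * M_p**3 / m)


# e^3 new Hubble-size domains per Hubble time, about half with a growing field
VOLUME_GROWTH_PER_HUBBLE_TIME = math.e**3 / 2

if __name__ == "__main__":
    m = 1e-6
    phi_c = crossover_field(m)
    for phi in (0.1 * phi_c, phi_c, 10 * phi_c):
        print(phi, classical_step(phi), quantum_step(phi, m))
    print("volume growth factor:", VOLUME_GROWTH_PER_HUBBLE_TIME)
```

Below the crossover the classical step dominates and the field rolls down almost deterministically; above it the quantum step dominates, and the volume occupied by the growing field increases by roughly a factor of 10 per Hubble time.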

This effect is very unusual. Its investigation still brings us new unexpected
results. For example, for a long time it was believed that
self-reproduction in the chaotic inflation scenario can occur only if the scalar
field $\phi$ is greater than $\phi^*$ [20]. However, it was shown in [24] that if the
size of the initial inflationary domain is large enough, then the process of
self-reproduction of the universe begins for all values of the field $\phi$ for which
inflation is possible (for $\phi > M_p$ in the theory $\frac{m^2}{2}\phi^2$). This result is based
on the investigation of the probability of quantum jumps with amplitude
$\delta\phi \gg H/2\pi$.

Until now we have considered the simplest inflationary model with only
one scalar field, which had only one minimum of its potential energy.
Meanwhile, realistic models of elementary particles propound many kinds of scalar
fields. For example, in the unified theories of weak, strong and electromag- 
netic interactions, at least two other scalar fields exist. The potential energy 
of these scalar fields may have several different minima. This means that 
the same theory may have different “vacuum states” , corresponding to dif- 
ferent types of symmetry breaking between fundamental interactions, and, 
as a result, to different laws of low-energy physics. 

As a result of quantum jumps of the scalar fields during inflation, the 
universe may become divided into infinitely many exponentially large do- 
mains that have different laws of low-energy physics. Note that this division 
occurs even if the whole universe originally began in the same state, corre- 
sponding to one particular minimum of potential energy. 

To illustrate this scenario, we present here the results of computer
simulations of evolution of a system of two scalar fields during inflation.
The field $\phi$ is the inflaton field driving inflation; it is shown by the height of
the distribution of the field $\phi(x, y)$ in a two-dimensional slice of the universe.
The field $\chi$ determines the type of spontaneous symmetry breaking which
may occur in the theory. We paint the surface black if this field is in a state
corresponding to one of the two minima of its effective potential; we paint
it white if it is in the second minimum corresponding to a different type of
symmetry breaking, and therefore to a different set of laws of low-energy
physics.

In the beginning of the process the whole inflationary domain was black,
and the distribution of both fields was very homogeneous. Then the domain
became exponentially large (but it has the same size in comoving coordinates,
as shown in Fig. 2). Each peak of the mountains corresponds to
nearly Planckian density and can be interpreted as a beginning of a new
“Big Bang”. The laws of physics are rapidly changing there, but they
become fixed in the parts of the universe where the field $\phi$ becomes small.
These parts correspond to valleys in Fig. 2. Thus quantum fluctuations of
the scalar fields divide the universe into exponentially large domains with
different laws of low-energy physics, and with different values of energy
density.

If this scenario is correct, then physics alone cannot provide a complete 
explanation for all properties of our part of the universe. The same physical 
theory may yield large parts of the universe that have diverse properties. 
According to this scenario, we find ourselves inside a four-dimensional domain
with our kind of physical laws not because domains with different
dimensionality and with alternate properties are impossible or improbable, 
but simply because our kind of life cannot exist in other domains. 

This consideration is based on the anthropic principle, which was not 
very popular among physicists for two main reasons. First of all, it was 
based on the assumption that the universe was created many times until
the final success. Second, it would be much easier (and quite sufficient) to
achieve this success in a small vicinity of the Solar system rather than in 
the whole observable part of our universe. 

Both objections can be answered in the context of the theory of eternal 
inflation. First of all, the universe indeed reproduces itself in all its possible 
versions. Second, if the conditions suitable for the existence of life appear 
in a small vicinity of the Solar system, then because of inflation the same 
conditions will exist in a domain much greater than the observable part of 
the universe. This means that inflationary theory for the first time provides 
real physical justification of the anthropic principle. 

7 Stochastic approach to inflation 

The best tool for a systematic investigation of the self-reproducing
inflationary universe is provided by the stochastic approach to inflation. This
approach takes into account the most important source of quantum effects
during inflation, the long-wavelength fluctuations of the scalar fields.

Fig. 2. Evolution of scalar fields $\phi$ and $\chi$ during the process of self-reproduction
of the universe. The height of the distribution shows the value of the field $\phi$
which drives inflation. The surface is painted black in those parts of the universe
where the scalar field $\chi$ is in the first minimum of its effective potential, and white
where it is in the second minimum. Laws of low-energy physics are different in
the regions of different color. The peaks of the “mountains” correspond to places
where quantum fluctuations bring the scalar fields back to the Planck density.
Each of such places in a certain sense can be considered as a beginning of a new
Big Bang.



The process generating a classical scalar field $\phi(x)$ in the
inflationary universe can be interpreted as the result of the Brownian
motion of the field $\phi$ induced by the conversion of quantum fluctuations
of that field into a classical field $\phi(x)$. For any given mode with
fixed $p$, this conversion occurs whenever the physical momentum
$k \sim p\,e^{-Ht}$ becomes smaller than $H$. A "freezing" of the amplitude
of the field $\phi_p(t)$ then occurs; see (3.7). Due to a phase mismatch
$e^{ipx}$, waves with different momenta contribute to the classical field
$\phi(x)$ with different signs, and this also shows up in equation (3.8),
which characterizes the variance in the random distribution of the field
that arises at the time of inflation. As in the standard diffusion problem
for a particle undergoing Brownian motion, the mean squared particle
distance from the origin is directly proportional to the duration of the
process (3.11).
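The proportionality between the variance and the duration of the process can
be illustrated with a minimal random-walk sketch. It assumes that during
each Hubble time $H^{-1}$ the field receives an independent Gaussian kick
with standard deviation $H/2\pi$, which reproduces the massless-field growth
law $\langle\phi^2\rangle = H^3 t/4\pi^2$. The function names, the number of
walkers and the seed are illustrative choices, not part of the original
text.

```python
import math
import random


def simulate_field_variance(hubble=1.0, n_hubble_times=40, n_walkers=5000, seed=0):
    """Estimate <phi^2> after n_hubble_times steps, each lasting 1/H.

    Assumption: every step adds an independent Gaussian jump of width
    H / (2*pi); the classical drift is switched off (massless field).
    """
    rng = random.Random(seed)
    jump = hubble / (2.0 * math.pi)
    total = 0.0
    for _ in range(n_walkers):
        phi = 0.0
        for _ in range(n_hubble_times):
            phi += rng.gauss(0.0, jump)
        total += phi * phi
    return total / n_walkers


def expected_variance(hubble, t):
    """Analytic growth law <phi^2> = H^3 t / (4 pi^2) for a massless field."""
    return hubble ** 3 * t / (4.0 * math.pi ** 2)
```

With $H = 1$ the elapsed time after 40 steps is $t = 40$, so the simulated
variance can be compared directly with `expected_variance(1.0, 40.0)`; the
two agree up to the statistical noise of the finite sample.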

At any given point, the diffusion of the field $\phi$ can conveniently be
described by the probability distribution $P_c(\phi,t)$ to find the field
$\phi$ at that point and at a given instant of time $t$. The subscript $c$
here serves to indicate the fact that this distribution also corresponds to
the fraction of the original (comoving) coordinate volume $d^3x$ (3.2)
filled by the field $\phi$ at time $t$.

It is well known that the Brownian motion of a particle can be described
by the diffusion equation for the probability distribution $P(x,t)$ to find
the particle at the point $x$ at the time $t$. A similar approach can be
used to describe the Brownian motion of the scalar field during inflation.
A general form of the diffusion equation for the field $\phi$ in the
inflationary universe is given by [21-24]

$$\frac{\partial P_c}{\partial t} = D\,\frac{\partial^2 P_c}{\partial\phi^2} + b\,\frac{\partial}{\partial\phi}\left(P_c\,\frac{dV}{d\phi}\right). \qquad (7.1)$$

Here $D$ is the diffusion coefficient, and $b$ is a mobility coefficient.
This equation is completely analogous to the standard Fokker-Planck
equation describing the Brownian motion; the main problem here is to derive
expressions for $D$ and $b$ in inflationary cosmology.

This can be done in the following simple way. First of all, the mobility
coefficient for a particle $x$, neglecting diffusion, is defined by the
equation $\dot x = -bF$, where $F$ is the external force acting on the
particle. In our case the analogous equation is

$$\ddot\phi + 3H\dot\phi = -\frac{dV}{d\phi}. \qquad (7.2)$$

Neglecting the first term, which is possible during inflation, we obtain the
mobility coefficient

$$\dot\phi = -\frac{1}{3H}\,\frac{dV}{d\phi}, \qquad \text{i.e.} \quad b = \frac{1}{3H}. \qquad (7.3)$$

The diffusion coefficient appears because of the jumps of the scalar field.
The amplitude of these jumps does not depend on the mass of the field
$\phi$ in the inflationary regime, when $m^2 \ll H^2$. Thus one can
calculate it in the case $m = 0$, when $dV/d\phi = 0$. But for the massless
field one has

$$\langle\phi^2\rangle = \int P_c(\phi,t)\,\phi^2\,d\phi = \frac{H^3}{4\pi^2}\,t.$$

Differentiating this relation with respect to $t$ and using (7.1), we obtain

$$D = \frac{H^3}{8\pi^2}.$$

This finally gives

$$\frac{\partial P_c}{\partial t} = \frac{H^3}{8\pi^2}\,\frac{\partial^2 P_c}{\partial\phi^2} + \frac{1}{3H}\,\frac{\partial}{\partial\phi}\left(P_c\,\frac{dV}{d\phi}\right). \qquad (7.4)$$



This equation was first derived by Starobinsky [22] by another, more rigor- 
ous method. 
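As a numerical illustration of the diffusion equation with drift
$-V'/3H$ and diffusion coefficient $H^3/8\pi^2$, the following minimal
finite-difference sketch evolves $P_c$ on a grid. It rests on assumptions
that go beyond the text: $H$ is held constant (a test field in de Sitter
space), the potential is $V = m^2\phi^2/2$, and the grid, time step and
function name are illustrative. Under these assumptions the zero-flux
condition gives a Gaussian with variance $3H^4/8\pi^2 m^2$, which the
scheme should approach.

```python
import math


def stationary_variance(hubble=1.0, m2=0.1, phi_max=4.0, n=101, dt=0.1, t_end=150.0):
    """Evolve dP/dt = -dJ/dphi with J = v P - D dP/dphi until it relaxes.

    Assumptions: constant H, V = m2 * phi^2 / 2, so v = -m2 * phi / (3 H),
    and D = H^3 / (8 pi^2). Explicit conservative scheme, zero-flux walls.
    """
    diff = hubble ** 3 / (8.0 * math.pi ** 2)
    dphi = 2.0 * phi_max / (n - 1)
    phi = [-phi_max + i * dphi for i in range(n)]
    prob = [0.0] * n
    prob[n // 2] = 1.0 / dphi  # start sharply peaked at phi = 0
    for _ in range(int(round(t_end / dt))):
        flux = [0.0] * (n + 1)  # flux[i] sits between cells i-1 and i
        for i in range(1, n):
            phi_half = 0.5 * (phi[i - 1] + phi[i])
            drift = -m2 * phi_half / (3.0 * hubble)
            flux[i] = (drift * 0.5 * (prob[i - 1] + prob[i])
                       - diff * (prob[i] - prob[i - 1]) / dphi)
        prob = [prob[i] - dt * (flux[i + 1] - flux[i]) / dphi for i in range(n)]
    norm = sum(prob) * dphi
    mean = sum(p * x for p, x in zip(prob, phi)) * dphi / norm
    return sum(p * (x - mean) ** 2 for p, x in zip(prob, phi)) * dphi / norm
```

The explicit scheme is stable here because $D\,\Delta t/\Delta\phi^2 < 1/2$
for the default parameters; a finer grid requires a smaller time step.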

This equation admits several important generalizations. First of all,
in statistical mechanics one may ask two types of questions. The first
question is: what is the probability for a particle to come to a point
$x(t)$ from a given initial point $x_0(t = 0)$? This question is addressed
by equation (7.4). But one may also ask: what is the probability that a
particle came to a given point $x(t)$ from some initial point
$x_0(t = 0)$?

To address both of these questions simultaneously with respect to the
field $\phi$ one should introduce the probability distribution
$P_c(\phi,t|\phi_0)$. This distribution obeys two equations. The first one
is called the forward Kolmogorov equation (or the Fokker-Planck equation);
it generalizes equation (7.4) for the case where $H$ is not a constant^:

$$\frac{\partial P_c(\phi,t|\phi_0)}{\partial t} = \frac{\partial}{\partial\phi}\left(\frac{H^{3/2}(\phi)}{8\pi^2}\,\frac{\partial\left(H^{3/2}(\phi)\,P_c(\phi,t|\phi_0)\right)}{\partial\phi} + \frac{V'(\phi)}{3H(\phi)}\,P_c(\phi,t|\phi_0)\right). \qquad (7.5)$$

The second equation is the adjoint of the first one; it is called the
backward Kolmogorov equation,

$$\frac{\partial P_c(\phi,t|\phi_0)}{\partial t} = \frac{H^{3/2}(\phi_0)}{8\pi^2}\,\frac{\partial}{\partial\phi_0}\left(H^{3/2}(\phi_0)\,\frac{\partial P_c(\phi,t|\phi_0)}{\partial\phi_0}\right) - \frac{V'(\phi_0)}{3H(\phi_0)}\,\frac{\partial P_c(\phi,t|\phi_0)}{\partial\phi_0}. \qquad (7.6)$$

In this equation one considers the value of the field $\phi$ at the time
$t$ as a constant, and finds the time dependence of the probability that
this value was reached during the time $t$ as a result of diffusion of the
scalar field from different possible initial values $\phi_0 = \phi(0)$.



^The generalization is not unique (the famous difference between
Stratonovich and Ito approaches in statistical mechanics), but this is a
minor subtlety which usually does not lead to any qualitative changes of
the results.

One may try to find a stationary solution of equations (7.5, 7.6),
assuming that $\partial P_c(\phi,t|\phi_0)/\partial t = 0$. The simplest
stationary solution (subexponential factors being omitted) would be [23,24]

$$P_c(\phi,t|\phi_0) \sim N \exp\left(\frac{3M_p^4}{8V(\phi)}\right) \exp\left(-\frac{3M_p^4}{8V(\phi_0)}\right). \qquad (7.7)$$

Here $N$ is the overall normalization factor. This equation is extremely
interesting. The first factor, $\exp\left(\frac{3M_p^4}{8V(\phi)}\right)$,
is equal to the square of the Hartle-Hawking wave function of the
universe [14], whereas the second one,
$\exp\left(-\frac{3M_p^4}{8V(\phi_0)}\right)$, gives the square of the
tunneling wave function [15]. The standard derivation of these wave
functions by the methods of Euclidean quantum gravity is very ambiguous,
and the interpretation of these expressions was rather confusing. For
example, the Hartle-Hawking wave function was supposed to describe the
ground state of the universe [14], i.e. the final state to which it
eventually comes after a possible stage of inflation and oscillations near
the minimum of the effective potential. However, for some reason this
function is often (and, I believe, incorrectly) interpreted as describing
the probability of initial conditions in the universe. According to this
interpretation, the universe must begin its evolution in a state with
extremely small $V(\phi)$, which makes inflation improbable. This
interpretation was the foundation of the recent claims [52] that in the
simplest inflationary models the universe must be open, with
$\Omega \sim 10^{-2}$. The density of galaxies in such models would be
exponentially small; there would be no other galaxies in the observable
part of the universe...

Equation (7.7) suggests that in those cases when the Hartle-Hawking
wave function is applicable, it describes the probability to end up in a
state with the given field $\phi$, whereas the tunneling wave function
describes the probability of initial conditions.
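The origin of the Hartle-Hawking factor can be made explicit with a short
worked derivation. It is a sketch under two assumptions that are not
spelled out at this point of the text: the stationary state has vanishing
probability flux in the forward Kolmogorov equation, and the Hubble
parameter obeys the slow-roll relation $H^2 = 8\pi V/3M_p^2$. Only the
dependence on $\phi$ is traced here; the dependence on $\phi_0$ follows
from the backward equation.

```latex
% Zero-flux condition of the forward Kolmogorov equation:
\frac{H^{3/2}}{8\pi^2}\,\frac{\partial}{\partial\phi}\bigl(H^{3/2}P_c\bigr)
  + \frac{V'}{3H}\,P_c = 0
\quad\Longrightarrow\quad
\frac{\partial}{\partial\phi}\ln\bigl(H^{3/2}P_c\bigr)
  = -\frac{8\pi^2 V'}{3H^4}.

% Assumed slow-roll relation: H^2 = 8\pi V/(3M_p^2),
% hence H^4 = 64\pi^2 V^2/(9M_p^4):
\frac{\partial}{\partial\phi}\ln\bigl(H^{3/2}P_c\bigr)
  = -\frac{3M_p^4}{8}\,\frac{V'}{V^2}
  = \frac{\partial}{\partial\phi}\left(\frac{3M_p^4}{8V}\right)
\quad\Longrightarrow\quad
P_c \propto H^{-3/2}(\phi)\,\exp\left(\frac{3M_p^4}{8V(\phi)}\right).
```

The prefactor $H^{-3/2}$ is one of the subexponential factors omitted in
the simplest form of the stationary solution.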

This would be a great peaceful resolution of the conflict between the two
wave functions. Unfortunately, the situation is even more complicated. In
all realistic cosmological theories, in which $V(\phi) = 0$ at its minimum,
the Hartle-Hawking distribution $\exp\left(\frac{3M_p^4}{8V(\phi)}\right)$
is not normalizable. The source of this difficulty can be easily
understood: any stationary distribution may exist only due to the
compensation of the classical flow of the field $\phi$ downwards to the
minimum of $V(\phi)$ by the diffusion motion upwards. However, the
diffusion of the field $\phi$ discussed above exists only during inflation.
There is no diffusion upwards from the region near the minimum of the
effective potential where there is no inflation. Therefore the expression
(7.7) is not a true solution of the equation (7.5); all physically
acceptable solutions for $P_c$ are non-stationary (decaying) [20].

But in the previous section we argued that the universe in a certain sense
is stationary. It is self-reproducing, and inflation continues there
forever. However, it is difficult to see it using the probability
distribution $P_c$, which ignores the expansion of the universe.

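Fortunately, one can find stationary solutions describing the probability
distribution $P_p(\phi,t|\phi_0)$ introduced in [20]. This function
describes the total physical volume of all parts of the universe containing
the field $\phi$ at a time $t$. Thus, this function takes into account the
different speeds of exponential growth of the regions filled with different
values of the field $\phi$.

The system of stochastic equations for $P_p$ can be obtained from
equations (7.5, 7.6) by adding the term $3HP_p$, which appears due to the
growth of the physical volume of the universe by the factor
$1 + 3H(\phi)\,dt$ during each time interval $dt$, see [24] and references
therein:

$$\frac{\partial P_p}{\partial t} = \frac{H^{3/2}(\phi_0)}{8\pi^2}\,\frac{\partial}{\partial\phi_0}\left(H^{3/2}(\phi_0)\,\frac{\partial P_p}{\partial\phi_0}\right) - \frac{V'(\phi_0)}{3H(\phi_0)}\,\frac{\partial P_p}{\partial\phi_0} + 3H(\phi_0)\,P_p, \qquad (7.8)$$

$$\frac{\partial P_p}{\partial t} = \frac{\partial}{\partial\phi}\left(\frac{H^{3/2}(\phi)}{8\pi^2}\,\frac{\partial\left(H^{3/2}(\phi)\,P_p\right)}{\partial\phi} + \frac{V'(\phi)}{3H(\phi)}\,P_p\right) + 3H(\phi)\,P_p. \qquad (7.9)$$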



If one considers only those parts of the universe where the energy density
is always smaller than the Planck density, these equations have a
stationary solution, which can be represented in the following form [24]:

$$P_p(\phi,t|\phi_0) \sim A(\phi)\,B(\phi_0)\,e^{\lambda t}. \qquad (7.10)$$



The structure of the result is rather unusual and unexpected. First of all,
the dependence on $\phi$, $\phi_0$ and $t$ is factorized. If instead of the
total volume of the universe at a given time one is interested in the
normalized probability distribution to find a given volume in a state with
a given field, one obtains a time-independent result:

$$P_p(\phi,t|\phi_0) \sim A(\phi)\,B(\phi_0). \qquad (7.11)$$

This equation is very similar to equation (7.7). In particular, for [...]
it contains the factor $\exp\left(-\frac{3M_p^4}{8V(\phi_0)}\right)$, which
again demonstrates that the tunneling wave function of the universe may be
used for the investigation of the initial conditions. But is it really
necessary to study initial conditions for inflation? Equation (7.11) shows
that the relative fraction of the volume of the universe in a state with
the field $\phi$ does not depend on initial conditions in the early
universe and on its age.

Thus during the last ten years inflationary theory changed considerably. 
It has broken an umbilical cord connecting it with the old Big Bang theory, 
old and new inflation, and acquired an independent life of its own. For the 
practical purposes of describing the observable part of our universe one may
still speak about the Big Bang, just as one can still use Newtonian gravity
theory to describe the Solar system with very high precision. However, if 
one tries to understand the beginning of the universe, or its end, or its global 
structure, then some of the notions of the Big Bang theory become inade- 
quate. Instead of one single Big Bang producing a single-bubble universe, 
we are speaking now about inflationary bubbles producing new bubbles, 
producing new bubbles, ad infinitum. In the new theory there is no end of 
the universe evolution, and the notion of the Big Bang loses its dominant 
position, being removed to the indefinite past. 

From this new perspective many old problems of cosmology, including 
the problem of initial conditions, look much less profound than they seemed 
before. In many versions of inflationary theory it can be shown that the 
fraction of the volume of the universe with given properties (with given 
values of fields, with a given density of matter, etc.) does not depend on 
time, both at the stage of inflation and even after it. Thus each part of the 
universe evolves in time, but the universe as a whole may be stationary, and 
the properties of its parts do not depend on the initial conditions [24] . 

Of course, this happens only for the (rather broad) set of initial
conditions which lead to self-reproduction of the universe. However, only a
finite number of observers live in the universes created in a state with
initial conditions which do not allow self-reproduction, whereas infinitely
many observers can live in the universes with the conditions which allow
self-reproduction. Thus it seems plausible that we (if we are typical, and
live in the place where most observers do) should live in the universe
created in a state with initial conditions which allow self-reproduction.
On the other hand, stationarity of the self-reproducing universe implies
that an exact knowledge of these initial conditions in a self-reproducing
universe is irrelevant for the investigation of its future evolution [24].



8 (P)reheating after inflation 

The theory of reheating of the universe after inflation is the most impor- 
tant application of the quantum theory of particle creation, since almost all 
matter constituting the universe was created during this process. 

At the stage of inflation all energy is concentrated in a classical slowly
moving inflaton field $\phi$. Soon after the end of inflation this field
begins to oscillate near the minimum of its effective potential. Eventually
it produces many elementary particles, they interact with each other and
come to a state of thermal equilibrium with some temperature $T_r$.

An elementary theory of this process was developed many years ago [25].
It was based on the assumption that the oscillating inflaton field can be
considered as a collection of noninteracting scalar particles, each of which
decays separately in accordance with the perturbation theory of particle
decay. However, recently it was understood that in many inflationary models
the first stages of reheating occur in a regime of broad parametric
resonance. To distinguish this stage from the subsequent stages of slow
reheating and thermalization, it was called preheating [26]. The energy
transfer from the inflaton field to other Bose fields and particles during
preheating is extremely efficient.

To explain the main idea of the new scenario we will consider first the
simplest model of chaotic inflation with the effective potential
$V(\phi) = \frac{m^2}{2}\phi^2$ and with the interaction Lagrangian
$-\frac{1}{2}g^2\phi^2\chi^2$. We will take $m = 10^{-6}M_p$, as required
by microwave background anisotropy [7], and in the beginning we will assume
for simplicity that $\chi$ particles do not have a bare mass, i.e.
$m_\chi(\phi) = g|\phi|$.

In this model inflation occurs at $|\phi| > 0.3M_p$ [7]. Suppose for
definiteness that initially $\phi$ is large and negative, and inflation ends
at $\phi \sim -0.3M_p$. After that the field $\phi$ rolls to $\phi = 0$, and
then it oscillates about $\phi = 0$ with a gradually decreasing amplitude.

For the quadratic potential $V(\phi) = \frac{m^2}{2}\phi^2$ the amplitude
after the first oscillation becomes only $0.04M_p$, i.e. it drops by a
factor of ten during the first oscillation. Later on, the solution for the
scalar field $\phi$ asymptotically approaches the regime

$$\phi(t) = \Phi(t)\,\sin mt, \qquad \Phi(t) = \frac{M_p}{\sqrt{3\pi}\,mt} \sim \frac{M_p}{2\pi\sqrt{3\pi}\,N}. \qquad (8.1)$$

Here $\Phi(t)$ is the amplitude of oscillations, and $N$ is the number of
oscillations since the end of inflation. For simple estimates which we will
make later one may use

$$\Phi(t) \approx \frac{M_p}{3mt} \approx \frac{M_p}{20N}. \qquad (8.2)$$

The scale factor averaged over several oscillations grows as
$a(t) \approx a_0 (t/t_0)^{2/3}$. Oscillations of $\phi$ in this theory are
sinusoidal, with the decreasing amplitude $\Phi(t) \propto a^{-3/2}(t)$.
The energy density of the field $\phi$ decreases in the same way as the
density of nonrelativistic particles of mass $m$:
$\rho_\phi = \frac{1}{2}\dot\phi^2 + \frac{1}{2}m^2\phi^2 \sim a^{-3}$.
Hence the coherent oscillations of the homogeneous scalar field correspond
to the matter dominated effective equation of state with vanishing
pressure.
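The asymptotic regime can be checked with a minimal numerical sketch of
the homogeneous field equation $\ddot\phi + 3H\dot\phi + m^2\phi = 0$. It
assumes the Friedmann relation $H^2 = \frac{8\pi}{3M_p^2}\rho_\phi$, units
in which $M_p = m = 1$, and a field released at rest at $\phi = -0.3M_p$;
the function name, the integrator (classical fourth-order Runge-Kutta) and
the step size are illustrative choices.

```python
import math


def evolve_inflaton(phi0=-0.3, t_end=200.0, dt=0.01):
    """Integrate phi'' + 3 H phi' + phi = 0 in units M_p = m = 1.

    Assumption: H^2 = (8 pi / 3) * rho with rho = (phi'^2 + phi^2) / 2.
    Returns (t, amplitude, H) at the final time, where the amplitude is
    estimated from the energy as sqrt(phi^2 + phi'^2).
    """
    def hubble(phi, dphi):
        return math.sqrt(4.0 * math.pi / 3.0 * (dphi * dphi + phi * phi))

    def rhs(phi, dphi):
        return dphi, -3.0 * hubble(phi, dphi) * dphi - phi

    phi, dphi, t = phi0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(phi, dphi)
        k2 = rhs(phi + 0.5 * dt * k1[0], dphi + 0.5 * dt * k1[1])
        k3 = rhs(phi + 0.5 * dt * k2[0], dphi + 0.5 * dt * k2[1])
        k4 = rhs(phi + dt * k3[0], dphi + dt * k3[1])
        phi += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        dphi += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        t += dt
    return t, math.sqrt(phi * phi + dphi * dphi), hubble(phi, dphi)
```

At late times the product of the amplitude and $\sqrt{3\pi}\,mt$ should
approach unity, and $Ht$ should approach $2/3$, as expected for a
matter-dominated expansion law $a \propto t^{2/3}$.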

We will assume that $g > 10^{-5}$ [26], which implies $gM_p > 10m$ for the
realistic value of the mass $m \sim 10^{-6}M_p$. Thus, immediately after
the end of inflation, when $\phi \sim M_p/3$, the effective mass $g|\phi|$
of the field $\chi$ is much greater than $m$. It decreases when the field
$\phi$ moves down, but initially this process remains adiabatic,
$|\dot m_\chi| \ll m_\chi^2$.

Particle production occurs at the time when the adiabaticity condition
becomes violated, i.e. when $|\dot m_\chi| \sim g|\dot\phi|$ becomes
greater than $m_\chi^2 = g^2\phi^2$. This happens only when the field
$\phi$ rolls close to $\phi = 0$. The velocity of the field at that time
was $|\dot\phi_0| \approx mM_p/10 \approx 10^{-7}M_p^2$. The process
becomes nonadiabatic for $g^2\phi^2 < g|\dot\phi_0|$, i.e. for
$-\phi_* < \phi < \phi_*$, where $\phi_* \sim \sqrt{|\dot\phi_0|/g}$ [26].
Note that for $g \gg 10^{-5}$ the interval $-\phi_* < \phi < \phi_*$ is
very narrow: $\phi_* \ll M_p/10$. As a result, the process of particle
production occurs nearly instantaneously, within the time

$$\Delta t_* \sim \frac{\phi_*}{|\dot\phi_0|} \sim (g|\dot\phi_0|)^{-1/2}. \qquad (8.3)$$

This time interval is much smaller than the age of the universe, so all
effects related to the expansion of the universe can be neglected during
the process of particle production. The uncertainty principle implies in
this case that the created particles will have typical momenta
$k \sim (\Delta t_*)^{-1} \sim (g|\dot\phi_0|)^{1/2}$.

The occupation number $n_k$ of $\chi$ particles with momentum $k$ is equal
to zero all the time when the field $\phi$ moves toward $\phi = 0$. When it
reaches $\phi = 0$ (or, more exactly, after it moves through the small
region $-\phi_* < \phi < \phi_*$) the occupation number suddenly (within
the time $\Delta t_*$) acquires the value [26]

$$n_k = \exp\left(-\frac{\pi k^2}{g|\dot\phi_0|}\right), \qquad (8.4)$$

and this value does not change until the field $\phi$ rolls to the point
$\phi = 0$ again.
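The estimates above can be collected in a few lines of code. This is a
minimal sketch in units $M_p = 1$; the function names and the default
parameter values are illustrative. The last function integrates the
occupation number over $d^3k/(2\pi)^3$ numerically; the closed form
$(g|\dot\phi_0|)^{3/2}/8\pi^3$ it can be compared with is the Gaussian
integral of the occupation number, and is not quoted in the text.

```python
import math


def production_scales(g, m=1e-6, m_p=1.0):
    """Return (phi_dot0, phi_star, dt_star, k_star) for the first crossing.

    Uses |phi_dot0| ~ m M_p / 10, phi_* ~ sqrt(|phi_dot0| / g) and
    dt_* ~ (g |phi_dot0|)^(-1/2), with k_star = 1 / dt_star.
    """
    phi_dot0 = m * m_p / 10.0
    phi_star = math.sqrt(phi_dot0 / g)
    dt_star = 1.0 / math.sqrt(g * phi_dot0)
    return phi_dot0, phi_star, dt_star, 1.0 / dt_star


def occupation_number(k, g, phi_dot0):
    """n_k = exp(-pi k^2 / (g |phi_dot0|))."""
    return math.exp(-math.pi * k * k / (g * phi_dot0))


def number_density(g, phi_dot0, n_grid=20000, k_max_factor=6.0):
    """Trapezoidal estimate of the integral of n_k over d^3k / (2 pi)^3."""
    k_star = math.sqrt(g * phi_dot0)
    dk = k_max_factor * k_star / n_grid
    total = 0.0
    for i in range(n_grid + 1):
        k = i * dk
        weight = 0.5 if i in (0, n_grid) else 1.0
        total += weight * k * k * occupation_number(k, g, phi_dot0)
    return total * dk / (2.0 * math.pi ** 2)
```

For example, `production_scales(1e-3)` gives $\phi_* = 10^{-2}M_p$, well
inside the narrow interval $\phi_* \ll M_p/10$ discussed above.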

To derive this equation one should first represent quantum fluctuations
of the scalar field $\chi$ minimally interacting with gravity in the
following way:

$$\hat\chi(t,\mathbf{x}) = \frac{1}{(2\pi)^{3/2}} \int d^3k \left(\hat a_k\,\chi_k(t)\,e^{-i\mathbf{k}\mathbf{x}} + \hat a_k^{+}\,\chi_k^{*}(t)\,e^{i\mathbf{k}\mathbf{x}}\right), \qquad (8.5)$$

where $\hat a_k$ and $\hat a_k^{+}$ are annihilation and creation
operators. In general, one should write equations for these fluctuations
taking into account expansion of the universe. However, in the beginning we
will neglect expansion. Then the functions $\chi_k$ obey the following
equation:

$$\ddot\chi_k + \left(k^2 + g^2\phi^2(t)\right)\chi_k = 0. \qquad (8.6)$$

Equation (8.6) describes an oscillator with a variable frequency
$\omega_k^2 = k^2 + g^2\phi^2(t)$. If $\phi$ does not change in time, then
one has the usual solution $\chi_k = e^{-i\omega_k t}/\sqrt{2\omega_k}$.
However, when the field $\phi$ changes, the solution becomes different, and
this difference can be interpreted in terms of creation of particles
$\chi$.

The number of created particles is equal to the energy of particles
$\frac{1}{2}|\dot\chi_k|^2 + \frac{1}{2}\omega_k^2|\chi_k|^2$ divided by
the energy $\omega_k$ of each particle:

$$n_k = \frac{\omega_k}{2}\left(\frac{|\dot\chi_k|^2}{\omega_k^2} + |\chi_k|^2\right) - \frac{1}{2}. \qquad (8.7)$$

The subtraction $-1/2$ is needed to eliminate vacuum fluctuations from the
counting. To calculate this number, one should solve equation (8.6) and
substitute the solutions into equation (8.7). One can easily check that for
the usual quantum fluctuations $\chi_k = e^{-i\omega_k t}/\sqrt{2\omega_k}$
one finds $n_k = 0$. In the case described above, when the particles are
created by the rapidly changing field $\phi$ in the regime of strong
violation of the adiabaticity condition, one can solve equation (8.6)
analytically and find the number of produced particles given by
equation (8.4).
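The analytic result can be cross-checked numerically. The sketch below
assumes that near the zero crossing the inflaton can be linearized as
$\phi \approx \dot\phi_0 t$, and uses the rescaled variables
$\tau = t\sqrt{g|\dot\phi_0|}$ and $\kappa = k/\sqrt{g|\dot\phi_0|}$, in
which the mode equation becomes $\chi'' + (\kappa^2 + \tau^2)\chi = 0$.
The mode starts in the adiabatic vacuum far from the crossing; the function
name, the integration range and the step size are illustrative.

```python
import math


def occupation_after_crossing(kappa, tau_max=20.0, dtau=0.002):
    """Integrate chi'' + (kappa^2 + tau^2) chi = 0 from -tau_max to +tau_max.

    Assumption: phi is linear in time near phi = 0. The initial data are the
    leading-order adiabatic vacuum; the return value is the particle number
    n = (|chi'|^2 + w^2 |chi|^2) / (2 w) - 1/2 evaluated at the end.
    """
    def rhs(tau, chi, dchi):
        return dchi, -(kappa * kappa + tau * tau) * chi

    tau = -tau_max
    w = math.sqrt(kappa * kappa + tau * tau)
    chi = complex(1.0 / math.sqrt(2.0 * w), 0.0)
    dchi = -1j * w * chi
    for _ in range(int(round(2.0 * tau_max / dtau))):
        k1c, k1d = rhs(tau, chi, dchi)
        k2c, k2d = rhs(tau + 0.5 * dtau, chi + 0.5 * dtau * k1c, dchi + 0.5 * dtau * k1d)
        k3c, k3d = rhs(tau + 0.5 * dtau, chi + 0.5 * dtau * k2c, dchi + 0.5 * dtau * k2d)
        k4c, k4d = rhs(tau + dtau, chi + dtau * k3c, dchi + dtau * k3d)
        chi += dtau / 6.0 * (k1c + 2.0 * k2c + 2.0 * k3c + k4c)
        dchi += dtau / 6.0 * (k1d + 2.0 * k2d + 2.0 * k3d + k4d)
        tau += dtau
    w = math.sqrt(kappa * kappa + tau * tau)
    return (abs(dchi) ** 2 + w * w * abs(chi) ** 2) / (2.0 * w) - 0.5
```

In these variables the analytic occupation number reads $e^{-\pi\kappa^2}$,
so `occupation_after_crossing(0.5)` can be compared with
`math.exp(-math.pi * 0.25)`. The small residual difference comes from the
finite integration range, where adiabaticity is only approximate.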

One can also solve the equations for quantum fluctuations and calculate
$n_k$ numerically. In Figure 3 we show the growth of fluctuations of the
field $\chi$ and the number of particles $\chi$ produced by the oscillating
field $\phi$ in the case when the mass of the field $\phi$ (i.e. the
frequency of its oscillations) is much smaller than the average mass of the
field $\chi$ given by $g\phi$.

The time evolution in Figure 3 is shown in units of $2\pi/m$, which
corresponds to the number of oscillations $N$ of the inflaton field $\phi$.
The oscillating field $\phi(t) \sim \Phi\sin mt$ is zero at integer and
half-integer values of the variable $mt/2\pi$. This allows us to identify
particle production with time intervals when $\phi(t)$ is very small.

During each oscillation of the inflaton field $\phi$, the field $\chi$
oscillates many times. Indeed, the effective mass $m_\chi(t) = g\phi(t)$ is
much greater than the inflaton mass $m$ for the main part of the period of
oscillation of the field $\phi$ in the broad resonance regime with
$q^{1/2} = g\Phi/2m \gg 1$. As a result, the typical frequency of
oscillation $\omega(t) = \sqrt{k^2 + g^2\phi^2(t)}$ of the field $\chi$ is
much higher than that of the field $\phi$. That is why during most of this
interval it is possible to talk about an adiabatically changing effective
mass $m_\chi(t)$. But this condition breaks at small $\phi$, and particles
$\chi$ are produced there.

Each time the field $\phi$ approaches the point $\phi = 0$, new $\chi$
particles are produced. Bose statistics implies that the number of
particles produced each time will be proportional to the number of
particles produced before. This leads to an explosive process of particle
production out of the state of thermal equilibrium. We called this process
preheating [26].

This process does not occur for all momenta. It is most efficient if the
field $\phi$ comes to the point $\phi = 0$ in phase with the field
$\chi_k$, which depends on $k$; see the phases of the field $\chi_k$ for
some particular values of $k$ for which the process is most efficient in
the upper panel of Figure 3. Thus we deal with the effect of the
exponential growth of the number of particles $\chi$ due to parametric
resonance.

Fig. 3. Broad parametric resonance for the field $\chi$ in Minkowski space
in the theory $\frac{m^2}{2}\phi^2$. For each oscillation of the field
$\phi(t)$ the field $\chi_k$ oscillates many times. Each peak in the
amplitude of the oscillations of the field $\chi$ corresponds to a place
where $\phi(t) = 0$. At this time the occupation number $n_k$ is not well
defined, but soon after that time it stabilizes at a new, higher level, and
remains constant until the next jump. A comparison of the two parts of this
figure demonstrates the importance of using proper variables for the
description of preheating. Both $\chi_k$ and the integrated dispersion
$\langle\chi^2\rangle$ behave erratically in the process of parametric
resonance. Meanwhile $n_k$ is an adiabatic invariant. Therefore, the
behavior of $n_k$ is relatively simple and predictable everywhere except
the short intervals of time when $\phi(t)$ is very small and the particle
production occurs.



Expansion of the universe modifies this picture for many reasons. First
of all, expansion of the universe redshifts produced particles, making
their momenta smaller. More importantly, the amplitude of oscillations of
the field $\phi$ decreases because of the expansion. Therefore the
frequency of oscillations of the field $\chi$ also decreases. This may
destroy the parametric resonance because it changes in an unpredictable way
the phase of the oscillations of the field $\chi$ at each next moment when
$\phi$ becomes zero.

Fig. 4. Early stages of parametric resonance in the theory
$\frac{m^2}{2}\phi^2$ in an expanding universe with scale factor
$a \sim t^{2/3}$ for $g = 5\times10^{-4}$, $m = 10^{-6}M_p$. Note that the
number of particles $n_k$ in this process typically increases, but it may
occasionally decrease as well. This is a distinctive feature of stochastic
resonance in an expanding universe. A decrease in the number of particles
is a purely quantum mechanical effect which would be impossible if these
particles were in a state of thermal equilibrium.

That is why the number of created particles $\chi$ may either increase or
decrease each time the field $\phi$ becomes zero. However, a more detailed
investigation shows that it increases three times more often than it
decreases, so the total number of produced particles grows exponentially,
though in a rather specific way, see Figure 4. We called this regime
stochastic resonance.

In the course of time the amplitude of the oscillations of the field
$\phi$ decreases, and when $g\phi$ becomes smaller than $m$, particle
production becomes inefficient, and their number stops growing.
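The statement that the particle number increases three times more often
than it decreases can be illustrated with a short sketch. It is not the
detailed investigation referred to above, but a toy model built on two
assumptions: each zero crossing acts as a Bogoliubov transformation whose
single-crossing yield is $b = e^{-\pi\kappa^2}$ with
$\kappa^2 = k^2/g|\dot\phi_0|$, and the accumulated phase $\theta$ is
effectively random in an expanding universe. Composing the transformation
with an existing population $n$ then gives
$n' = b + (1+2b)\,n - 2\sqrt{b(1+b)}\sqrt{n(1+n)}\,\sin\theta$. The
function names are illustrative.

```python
import math
import random


def next_occupation(n, kappa, theta):
    """Occupation number after one more zero crossing (Bogoliubov composition)."""
    b = math.exp(-math.pi * kappa * kappa)
    return (b + (1.0 + 2.0 * b) * n
            - 2.0 * math.sqrt(b * (1.0 + b)) * math.sqrt(n * (1.0 + n)) * math.sin(theta))


def fraction_of_increases(kappa=0.0, n=1e6, n_phases=10000):
    """Fraction of uniformly spaced phases for which n grows, for large n."""
    count = 0
    for i in range(n_phases):
        theta = 2.0 * math.pi * (i + 0.5) / n_phases
        if next_occupation(n, kappa, theta) > n:
            count += 1
    return count / n_phases


def simulate_stochastic_resonance(kappa=0.0, n_crossings=20, seed=1):
    """Follow n through successive crossings with independent random phases."""
    rng = random.Random(seed)
    n = 0.0
    history = [n]
    for _ in range(n_crossings):
        n = next_occupation(n, kappa, rng.uniform(0.0, 2.0 * math.pi))
        history.append(n)
    return history
```

For $\kappa = 0$ and large $n$ the growth factor is
$3 - 2\sqrt{2}\sin\theta$, which exceeds unity for three quarters of all
phases, in line with the ratio quoted in the text.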

In reality the situation is even more complicated. First of all, created
particles change the frequency of oscillations of the field $\phi$ because
they give a contribution $\sim g^2\langle\chi^2\rangle$ to the effective
mass squared of the inflaton field [26]. Also, these particles scatter on
each other and on the oscillating scalar field $\phi$, which leads to
additional particle production. As a result, it becomes extremely difficult
to describe analytically the last stages of the process of the parametric
resonance, even though in many cases it is possible to estimate the final
results. In particular, one can show that the whole process of parametric
resonance typically takes only a few dozen oscillations, and the final
occupation numbers of particles grow up to $n_k \sim 10^2 g^{-2}$ [26]. But
a detailed description of the last stages of preheating requires lattice
simulations, as proposed by Tkachev and Khlebnikov [28].

The theory of preheating is very sensitive to the choice of the model.
For example, in the theory
$\frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$ the resonance does
not become stochastic despite expansion of the universe. However, if one
adds to this theory even a very small term $\frac{m^2}{2}\phi^2$, the
resonance becomes stochastic [27].

This conclusion is illustrated by Figure 5, where we show the development
of the resonance both for the massless theory with
$g^2/\lambda \sim 5200$, and for the theory with a small mass $m$. As we
see, in the purely massless theory the logarithm of the number density
$n_k$ for the leading growing mode increases linearly in time $x$, whereas
in the presence of a mass $m$, which we took to be much smaller than
$\sqrt{\lambda}\phi$ during the whole process, the resonance becomes
stochastic.

In fact, the development of the resonance is rather complicated even for
smaller $g^2/\lambda$. The resonance for a massive field with
$m \ll \sqrt{\lambda}\phi$ in this case is not stochastic, but it may
consist of stages of regular resonance separated by stages without any
resonance, see Figure 6.

Thus we see that the presence of the mass term $\frac{m^2}{2}\phi^2$ can
modify the nature of the resonance even if this term is much smaller than
$\frac{\lambda}{4}\phi^4$. This is a rather unexpected conclusion, which is
an additional manifestation of the nonperturbative nature of preheating.

Different regimes of parametric resonance in the theory
$\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$
are shown in Figure 7. We suppose that immediately after inflation the
amplitude $\Phi$ of the oscillating inflaton field is greater than
$m/\sqrt{\lambda}$. If [...], the $\chi$-particles are produced in the
regular stable resonance regime until the amplitude $\Phi(t)$ decreases to
$m/\sqrt{\lambda}$, after which the resonance occurs as in the theory
$\frac{m^2}{2}\phi^2 + \frac{g^2}{2}\phi^2\chi^2$ [26]. The resonance never
becomes stochastic.

If [...], the resonance originally develops as in the conformally
invariant theory $\frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$, but
with a decrease of $\Phi(t)$ the resonance becomes stochastic. Again, for
$\Phi(t) < m/\sqrt{\lambda}$ the resonance occurs as in the theory
$\frac{m^2}{2}\phi^2 + \frac{g^2}{2}\phi^2\chi^2$. In all cases the
resonance eventually disappears when the field $\Phi(t)$ becomes
sufficiently small. Reheating in this class of models can be complete only
if there is a symmetry breaking in the theory, i.e. $m^2 < 0$, or if one
adds interaction of the field $\phi$ with fermions. In both cases the last
stages of reheating are described by perturbation theory [27].

Fig. 5. Development of the resonance in the theory
$\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$
for $g^2/\lambda = 5200$. The upper curve corresponds to the massless
theory, the lower curve describes stochastic resonance in a theory with a
mass $m$ which is chosen to be much smaller than $\sqrt{\lambda}\phi$
during the whole period of calculations. Nevertheless, the presence of a
small mass term completely changes the development of the resonance.

Adding fermions does not substantially alter the description of the stage
of parametric resonance. Meanwhile the change of sign of $m^2$ does lead to
substantial changes in the theory of preheating, see Figure 8. Here we will
briefly describe the structure of the resonance in the theory
$-\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$
for various $g^2$ and $\lambda$, neglecting effects of backreaction.

First of all, at $\Phi \gg m/\sqrt{\lambda}$ the field $\phi$ oscillates in
the same way as in the massless theory
$\frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$. The condition for
the resonance to be stochastic is $\Phi <$ [...].

Fig. 6. Development of the resonance in the theory
$\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$
with $m^2 \ll \lambda\phi^2$ for $g^2/\lambda = 240$. In this particular
case the resonance is not stochastic. As time $x$ grows, the relative
contribution of the mass term to the equation describing the resonance also
grows. This shifts the mode from one instability band to another.



However, as soon as the amplitude $\Phi$ drops down to $m/\sqrt{\lambda}$,
the situation changes dramatically. First of all, depending on the values
of parameters the field rolls to one of the minima of its effective
potential at $\phi = \pm m/\sqrt{\lambda}$. The description of this process
is rather complicated. Depending on the values of parameters and on the
relation between $\sqrt{\langle\phi^2\rangle}$,
$\sqrt{\langle\chi^2\rangle}$ and $\sigma \equiv m/\sqrt{\lambda}$, the
universe may become divided into domains with $\phi = \pm\sigma$, or it may
end up in a single state with a definite sign of $\phi$. After this
transitional period the field $\phi$ oscillates near the minimum of the
effective potential at $|\phi| = m/\sqrt{\lambda}$ with an amplitude
$\Phi$ [...] $\sigma = m/\sqrt{\lambda}$. These oscillations lead to
parametric resonance with $\chi$-particle production. For definiteness we
will consider here the regime [...] $< m \ll$ [...]. The resonance in this
case is possible only if [...]. Using the results of [26] one can show that
the resonance is possible only for [...]. (The resonance may terminate
somewhat earlier if the particles produced by the parametric resonance give
a considerable contribution to the energy density of the universe.)
However, this is not the end of reheating, because the perturbative decay
of the inflaton field remains possible. It occurs with the decay rate
$\Gamma(\phi \to \chi\chi) =$ [...]. This is the process which is
responsible for the last stages of the decay of the inflaton field. It
occurs only if one $\phi$-particle can decay into two $\chi$-particles,
which implies that [...].

Fig. 7. Schematic representation of different regimes which are possible
in the theory
$\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$
for $m/\sqrt{\lambda} \ll 10^{-1}M_p$ and for various relations between
$g^2$ and $\lambda$ in an expanding universe. The theory developed in this
paper describes the resonance in the white area above the line
$\Phi = m/\sqrt{\lambda}$. The theory of preheating for
$\Phi < m/\sqrt{\lambda}$ is given in [26]. A complete decay of the
inflaton is possible only if additional interactions are present in the
theory which allow one inflaton particle to decay to several other
particles, for example, an interaction with fermions.

Thus we see that preheating is an incredibly rich phenomenon. Interest- 
ingly, complete decay of the inflaton field is not by any means guaranteed. 
In most of the models not involving fermions the decay never completes. 
Efficiency of preheating and, consequently, efficiency of baryogenesis, de- 
pends in a very nonmonotonic way on parameters of the theory. This may 
lead to a certain “unnatural selection” of the theories where all necessary 
conditions for creation of matter and the subsequent emergence of life are 
satisfied. 

Bosons produced at that stage are far away from thermal equilibrium and
have enormously large occupation numbers. Explosive reheating leads to
many interesting effects. For example, specific nonthermal phase
transitions may occur soon after preheating, which are capable of restoring
symmetry even in the theories with symmetry breaking on the scale
$\sim 10^{16}$ GeV [29]. These phase transitions are capable of producing
topological defects such as strings, domain walls and monopoles [30].
Strong deviation from thermal equilibrium and the possibility of production
of superheavy particles by oscillations of a relatively light inflaton
field may resurrect the theory of GUT baryogenesis [31] and may
considerably change the way baryons are produced in the Affleck-Dine
scenario [32], and in the electroweak theory [33].

Fig. 8. Schematic representation of different regimes which are possible
in the theory
$-\frac{m^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 + \frac{g^2}{2}\phi^2\chi^2$.
White regions correspond to the regime of a regular stable resonance, a
small dark region in the left corner near the origin corresponds to the
perturbative decay $\phi \to \chi\chi$. Unless additional interactions are
included (see the previous figure), a complete decay of the inflaton field
is possible only in this small area.

Usually only a small fraction of the energy of the inflaton field
$\sim 10^{-2}g^2$ is transferred to the particles $\chi$ when the field
$\phi$ approaches the point $\phi = 0$ for the first time [34]. The role of
the parametric resonance is to increase this energy exponentially within
several oscillations of the inflaton field. But suppose that the particles
$\chi$ interact with fermions $\psi$ with the coupling
$h\bar\psi\psi\chi$. If this coupling is strong enough, then $\chi$
particles may decay to fermions before the oscillating field $\phi$ returns
back to the minimum of the effective potential. If this happens, parametric
resonance does not occur. However, something equally interesting may occur
instead of it: the energy density of the $\chi$ particles at the moment of
their decay may become much greater than their energy density at the moment
of their creation. This may be sufficient for a complete reheating.

Indeed, prior to their decay the number density of $\chi$ particles
produced at the point $\phi = 0$ remains practically constant [26], whereas
the effective mass of each $\chi$ particle grows as $m_\chi = g\phi$ when
the field $\phi$ rolls up from the minimum of the effective potential.
Therefore their total energy density grows. One may say that $\chi$
particles are "fattened", being fed by the energy of the rolling field
$\phi$. The fattened $\chi$ particles tend to decay to fermions at the
moment when they have the greatest mass, i.e. when $\phi$ reaches its
maximal value $\sim 10^{-1}M_p$, just before it begins rolling back to
$\phi = 0$.

At that moment y particles can decay to two fermions with mass up to 

~ |10“^Mp, which can be as large as 5 x 10^^ GeV for g ~ 1. This is 
5 orders of magnitude greater than the masses of the particles which can be 
produced by the usual decay of (j) particles. As a result, the chain reaction 
(j) —>■ X considerably enhances the efficiency of transfer of energy of the 
infiaton field to matter. 

More importantly, superheavy particles $\psi$ (or the products of their
decay) may eventually dominate the total energy density of matter even if in
the beginning their energy density was relatively small. For example, the
energy density of the oscillating inflaton field in the theory with the
effective potential $\frac{\lambda}{4}\phi^4$ decreases as $a^{-4}$ in an
expanding universe with a scale factor $a(t)$. Meanwhile the energy density
stored in the nonrelativistic particles $\psi$ (prior to their decay)
decreases only as $a^{-3}$. Therefore their energy density rapidly becomes
dominant even if originally it was small. A subsequent decay of such
particles leads to a complete reheating of the universe.

Thus in this scenario the process of particle production occurs within less
than one oscillation of the inflaton field. We called it instant preheating
[34]. This mechanism is very efficient even in the situation when all other
mechanisms fail. Consider, for example, models where the post-inflationary
motion of the inflaton field occurs along a flat direction of the effective
potential. In such theories the standard scenario of reheating does not work
because the field $\phi$ does not oscillate. Until the invention of the
instant preheating scenario the only mechanism of reheating discussed in the
context of such models was based on the gravitational production of
particles [35]. The mechanism of instant preheating in such models typically
is much more efficient. After the moment when the $\chi$ particles are
produced their energy density grows due to the growth of the field $\phi$.
Meanwhile the energy density of the field $\phi$ moving along a flat
direction of $V(\phi)$ decreases extremely rapidly, as $a^{-6}(t)$.
Therefore very soon all energy becomes concentrated in the particles
produced at the end of inflation, and reheating completes.
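
The dilution argument used in this section can be illustrated with a short
numerical sketch. It assumes pure power laws, $\rho \propto a^{-n}$, with
$n = 4$ for an inflaton oscillating in a quartic potential, $n = 6$ for a
field rolling along a flat direction, and $n = 3$ for nonrelativistic
particles. The function names and the sample initial ratio are illustrative
assumptions and are not taken from [34].

```python
def density_ratio(initial_ratio, a, n_field, n_particles=3.0):
    """Ratio rho_particles / rho_field after expansion by a factor a.

    Assumes rho_field ~ a**(-n_field) and rho_particles ~ a**(-n_particles).
    """
    return initial_ratio * a ** (n_field - n_particles)


def expansion_until_domination(initial_ratio, n_field, n_particles=3.0):
    """Expansion factor at which the two energy densities become equal."""
    return (1.0 / initial_ratio) ** (1.0 / (n_field - n_particles))


if __name__ == "__main__":
    r0 = 1e-3  # hypothetical initial ratio rho_particles / rho_field
    cases = (("quartic oscillations", 4.0), ("flat direction", 6.0))
    for label, n_field in cases:
        a_eq = expansion_until_domination(r0, n_field)
        print(f"{label}: particles dominate after expansion by {a_eq:.3g}")
```

With an initial ratio of $10^{-3}$ the particles take over after the universe
expands by a factor of $10^3$ in the quartic case, but only by a factor of 10
when the field energy falls as $a^{-6}$.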



9 Phase transitions and inflation after preheating 

The theory of cosmological phase transitions is usually associated with the 
processes of symmetry restoration due to high temperature effects and the 
subsequent symmetry breaking which occurs as the temperature decreases
in an expanding universe [36-38]. A particularly important version of this
theory is the theory of first order cosmological phase transitions developed
in [39]. It served as a basis for the first versions of inflationary cosmology
[40], as well as for the theory of electroweak baryogenesis [41].

Recently this theory was supplemented by the possibility of nonthermal 
cosmological phase transitions [42,43], i.e. phase transitions driven by fluc- 
tuations produced so rapidly that they do not have time to thermalize. In 
some cases these phase transitions are first order [44]; they occur by the
formation of bubbles of the phase with spontaneously broken symmetry inside
the metastable symmetric phase. If the lifetime of the metastable state is 
large enough, one may encounter a short secondary stage of inflation after 
preheating [42]. Such a secondary inflation stage could be important in 
solving the moduli and monopole problems. 

In this paper we will briefly present the theory of such phase transitions 
and then give the results of numerical lattice simulations which directly 
demonstrated the possibility of such brief inflation. 

Consider a set of scalar fields with the potential

$$ V(\phi,\chi) = \frac{\lambda}{4}\left(\phi^2 - v^2\right)^2 + \frac{g^2}{2}\phi^2\chi^2 . \tag{9.1} $$

The inflaton scalar field $\phi$ has a double-well potential and interacts
with an $N$-component scalar field $\chi$; $\chi^2 = \sum_{i=1}^{N}\chi_i^2$.
For simplicity, the field $\chi$ is taken to be massless and without
self-interaction. The fields couple minimally to gravity in a FRW universe
with a scale factor $a(t)$. We take $\lambda \approx 10^{-13}$ [7], and
assume that $g^2 \gg \lambda$. We consider $v \approx 10^{16}$ GeV, which
corresponds to the GUT scale.

The potential $V(\phi,\chi)$ has minima at $\phi = \pm v$, $\chi = 0$ and a
local maximum at $\phi = \chi = 0$ with curvature
$V_{,\phi\phi} = -\lambda v^2$. The effective potential acquires corrections
due to quantum and/or thermal fluctuations of the scalar fields [6,7],
$\Delta V = \frac{3}{2}\lambda\langle(\delta\phi)^2\rangle\phi^2 + \frac{g^2}{2}\langle\chi^2\rangle\phi^2 + \frac{g^2}{2}\langle(\delta\phi)^2\rangle\chi^2 + \ldots$,
where we have written only the leading terms depending on $\phi$ and $\chi$.
The effective mass squared of the field $\phi$ is given by

$$ m_\phi^2 = -\lambda v^2 + 3\lambda\phi^2 + 3\lambda\langle(\delta\phi)^2\rangle + g^2\langle\chi^2\rangle . \tag{9.2} $$

Symmetry is restored, i.e. $\phi = 0$ becomes a stable equilibrium point,
when the fluctuations $\langle(\delta\phi)^2\rangle$, $\langle\chi^2\rangle$
become sufficiently large to make the effective mass squared positive at
$\phi = 0$.

For example, one may consider matter in thermal equilibrium. Then, in the
large temperature limit, one has
$\langle(\delta\phi)^2\rangle = \langle\chi^2\rangle = \frac{T^2}{12}$. The
effective mass squared of the field $\phi$,

$$ m_{\phi,\mathrm{eff}}^2 = -\lambda v^2 + 3\lambda\phi^2 + 3\lambda\langle(\delta\phi)^2\rangle + g^2\langle\chi^2\rangle , \tag{9.3} $$

becomes positive and symmetry is restored (i.e. $\phi = 0$ becomes the stable
equilibrium point) for $T > T_c$, where
$T_c^2 = \frac{12\lambda v^2}{3\lambda + g^2}$. At this temperature the
energy density of the gas of ultrarelativistic particles is given by
$\rho = N(T_c)\frac{\pi^2}{30}T_c^4 = \frac{24\pi^2 N(T_c)\lambda^2 v^4}{5(3\lambda + g^2)^2}$.
Here $N(T)$ is the effective number of degrees of freedom at large
temperature, which in realistic situations may vary from $10^2$ to $10^3$.
Note that for $g^4 < \frac{96 N\pi^2}{5}\lambda$ this energy density is
greater than the vacuum energy density $V(0) = \frac{\lambda v^4}{4}$.
Meanwhile, for $g^4 \gtrsim \lambda$ radiative corrections are important;
they lead to creation of a local minimum of $V(\phi,\chi)$, and the phase
transition occurs from a strongly supercooled state [38]. That is why the
first models of inflation required supercooling at the moment of the phase
transition [40].

An exception from this rule is given by supersymmetric theories, where one
may have $g^4 \gg \lambda$ and still have a potential which is flat near the
origin due to cancellation of quantum corrections of bosons and fermions
[26]. In such cases thermal energy becomes smaller than the vacuum energy at
$T < T_0$, where $T_0^4 = \frac{15}{2N\pi^2}\lambda v^4$. Then one may even
have a short stage of inflation which begins at $T \sim T_0$ and ends at
$T = T_c$. During this time the universe may inflate by the factor

$$ \frac{a_c}{a_0} = \frac{T_0}{T_c} \sim 10^{-1}\left(\frac{g^4}{\lambda}\right)^{1/4} . \tag{9.4} $$
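
The estimate (9.4) can be checked with a few lines of arithmetic. The sketch
below is only an illustration under stated assumptions: it takes the
high-temperature values
$\langle(\delta\phi)^2\rangle = \langle\chi^2\rangle = T^2/12$, obtains $T_c$
from the vanishing of the effective mass squared at $\phi = 0$, and obtains
$T_0$ by equating the radiation energy density $N\pi^2T^4/30$ to the vacuum
energy density $\lambda v^4/4$ of the double-well potential. The function
names and the sample values of $g^2$ and $N$ are hypothetical.

```python
import math


def critical_temperature(lam, g2, v):
    """T_c from -lam*v^2 + (3*lam + g^2) * T^2 / 12 = 0."""
    return math.sqrt(12.0 * lam * v**2 / (3.0 * lam + g2))


def vacuum_domination_temperature(lam, v, n_dof):
    """T_0 from n_dof * pi^2 * T^4 / 30 = lam * v^4 / 4."""
    return (15.0 * lam * v**4 / (2.0 * n_dof * math.pi**2)) ** 0.25


def thermal_expansion_factor(lam, g2, v, n_dof):
    """Expansion factor a_c / a_0 = T_0 / T_c of the thermal inflation stage."""
    t_0 = vacuum_domination_temperature(lam, v, n_dof)
    return t_0 / critical_temperature(lam, g2, v)


if __name__ == "__main__":
    lam, g2, v, n_dof = 1e-13, 1e-4, 1e16, 100  # sample values, v in GeV
    ratio = thermal_expansion_factor(lam, g2, v, n_dof)
    estimate = 0.1 * (g2**2 / lam) ** 0.25
    print(f"T_0/T_c = {ratio:.3f}, 0.1*(g^4/lambda)^(1/4) = {estimate:.3f}")
```

For $\lambda = 10^{-13}$, $g^2 = 10^{-4}$ and $N = 100$ the ratio $T_0/T_c$
is about 1.5, within a factor of order one of the estimate
$10^{-1}(g^4/\lambda)^{1/4} \approx 1.8$.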



Similar phase transitions may occur even much more efficiently prior to
thermalization, due to the anomalously large expectation values
$\langle(\delta\phi)^2\rangle$ and $\langle\chi^2\rangle$ produced during
preheating [42,43]. The main idea behind this theory is that symmetry
restoration may occur even if the energy density would be insufficient to
generate large enough fluctuations of the fields in a thermal distribution.
The reason is that preheating preferentially excites low momentum modes of
the fields $\chi_i$ with anomalously high occupation numbers. These excited
modes can thus lead to very large fluctuations $\langle\chi^2\rangle$
relative to what would result if the same energy were distributed among all
modes thermally.

The main conclusions of [42,43] have been confirmed by detailed
investigation using lattice simulations in [44,45]. One of the main results
obtained in [44] was that for $g^2 \gg \lambda$ nonthermal phase transitions
are first order. They occur from a supercooled metastable vacuum at
$\phi = 0$ due to creation of bubbles containing field $\phi \neq 0$. This
result is very similar to the analogous result in the theory of thermal
phase transitions [38]. One may wonder whether for $g^2 \gg \lambda$ one may
have a stage of inflation in the metastable vacuum $\phi = 0$.

Fig. 9. The spatial average of the inflaton field $\phi$ as a function of
time. The field $\phi$ is shown in units of $v$, the symmetry breaking
parameter. Time is shown in Planck units.



Analytical estimates of reference [42] suggested that this is indeed the
case, and the degree of this inflation is expected to be

$$ \frac{a_c}{a_0} \sim \left(\frac{g^2}{\lambda}\right)^{1/4} , \tag{9.5} $$

which is much greater than the number
$10^{-1}\left(\frac{g^4}{\lambda}\right)^{1/4}$ in the thermal inflation
scenario. One could also expect that the duration of inflation, just like the
strength of the phase transition, increases if one considers $N$ fields
$\chi_i$ with $N \gg 1$.



However, the theory of preheating is extremely complicated, and there
are some factors which could not be adequately taken into account in the
simple estimates of [42]. The most important factor is the effect of rescattering of
particles produced during preheating. This effect tends to shut down the 
resonant production of particles and thus shorten or prevent entirely the 
occurrence of a secondary stage of inflation. Thus the estimates above reflect 
the maximum degree of inflation possible for a given set of parameter values, 
but in practice the expansion factor will be somewhat smaller than these 
predictions. The only way to fully account for all the effects of backreaction 
and expansion is through numerical lattice simulations. 

A numerical investigation of this model with $g^2/\lambda \gg 1$ was first
performed in [44]. The authors found a strongly first order phase transition.

Fig. 10. These plots show the region of space in which symmetry breaking has
occurred at four successive times labeled by the value of the conformal time
$t_{pr}$.



The strength of the phase transition increased with an increase of
$g^2/\lambda$ and of the number $N$ of the fields $\chi_i$. The results
become especially impressive for $g^2/\lambda = 800$ and $N = 20$. With these
parameters the phase transition is strongly first order, and there is a short
stage of inflation prior to the phase transition [46]. The simulation showed
that the oscillations of the inflaton field decreased until the field was
trapped near zero (symmetry restoration). It remained there until the moment
of the phase transition when it rapidly jumped to its symmetry breaking
value, as shown in Figure 9.

This transition occurred in a nearly spherical region of the lattice which
quickly grew to encompass the entire space. The growth of this region of
the new phase is shown in Figure 10. The nearly perfect sphericity of this
region indicates that the transition was strongly first-order and occurred
due to formation of bubbles of the phase with $\phi \neq 0$.

An important signature of inflation is an equation of state with negative
pressure. Figure 11 shows the parameter $\alpha = p/\rho$, which becomes
negative during the metastable phase. At the moment of the phase transition
the universe becomes matter dominated and the pressure jumps to nearly 0.

From the beginning of this inflationary stage (roughly when the pressure 
becomes negative) to the moment of the phase transition the total expansion 
factor is 2.1. As expected, this is of the same order but somewhat lower 

Fig. 11. The ratio of pressure to energy density $p/\rho$. As we see,
pressure becomes negative approximately at the same time when $\ddot a > 0$.
(Values were time averaged over short time scales to make the plot smoother
and more readable.)



than the predicted maximum, $(g^2/\lambda)^{1/4} \approx 5.3$. We can thus
conclude that it is possible to achieve inflation for parameters for which
this would not have been possible in thermal equilibrium
($g^4 \ll \lambda$). In our simulation we have shown the occurrence of a very
brief stage of inflation. This stage should be much more significant for
larger values of $g^2/\lambda$.
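
The numbers quoted here follow from the two estimates of the expansion
factor. As a rough check, the sketch below evaluates the nonthermal estimate
$(g^2/\lambda)^{1/4}$ and the thermal estimate $10^{-1}(g^4/\lambda)^{1/4}$
for $g^2/\lambda = 800$. The value $\lambda = 10^{-13}$ is the one adopted in
this section; the function names are illustrative assumptions.

```python
def nonthermal_estimate(g2_over_lam):
    """Maximal expansion factor a_c/a_0 ~ (g^2/lambda)**(1/4)."""
    return g2_over_lam ** 0.25


def thermal_estimate(g2_over_lam, lam):
    """Thermal estimate a_c/a_0 ~ 0.1 * (g^4/lambda)**(1/4)."""
    g4_over_lam = g2_over_lam**2 * lam  # g^4/lambda = (g^2/lambda)^2 * lambda
    return 0.1 * g4_over_lam ** 0.25


if __name__ == "__main__":
    print(f"nonthermal: {nonthermal_estimate(800.0):.2f}")
    print(f"thermal:    {thermal_estimate(800.0, 1e-13):.2e}")
```

The first value is about 5.3, to be compared with the factor 2.1 found in the
simulation. The second is far below 1, consistent with the statement that
inflation would not have been possible in thermal equilibrium for these
parameters.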

Thus the results of our lattice simulation verify the possibility of a sec- 
ondary stage of inflation occurring due to nonthermal effects during pre- 
heating. The expansion factor during this inflationary period was roughly 
in accord with the theoretical prediction in [42]. Although this expansion 
was very brief for the parameters we studied, we expect it to be larger for 
higher values of the coupling constant g. Such secondary stages of inflation 
could have relevance for the moduli problem, monopole production, and 
many other aspects of reheating theory. 



10 Open inflation 

Until very recently, we did not have any consistent cosmological models 
describing a homogeneous open universe. Even though the Friedmann open 
universe model did exist, it did not appear to make any sense to assume 
that all parts of an infinite universe can be created simultaneously and have 
the same value of energy density everywhere. 

A physically consistent model of open universe was proposed only after 
the invention of inflationary cosmology. (This is somewhat paradoxical, be- 
cause most of inflationary models predict that the universe must be flat.) 
The main idea was to use the well known fact that the bubbles created in the 
process of quantum tunneling tend to have spherically symmetric shape, and
homogeneous interior, if the tunneling probability is strongly suppressed.
Bubble formation in the false vacuum is described by the Coleman-De Luccia
(CDL) instantons [47]. Any bubble formed by this mechanism looks from inside
like an infinite open universe [47,48]. If this universe continues inflating
inside the bubble, then we obtain an open inflationary universe. Then by a
certain fine-tuning of parameters one can get any value of $\Omega$ in the
range $0 < \Omega < 1$ [49].

Even though the basic idea of this scenario was pretty simple, it was very
difficult to find a realistic open inflation model. The general scenario
proposed in [49] was based on investigation of chaotic inflation and
tunneling in the theories of one scalar field $\phi$. There were many papers
containing a detailed investigation of density perturbations, anisotropy of
the microwave background radiation, and gravitational wave production in this
scenario. However, no models where this scenario could be successfully
realized had been proposed until very recently. As was shown in [50], in the
simplest models with polynomial potentials the tunneling occurs not by bubble
formation, but by jumping onto the top of the potential barrier described by
the Hawking-Moss instanton [51]. This process leads to formation of
inhomogeneous domains of a new phase, and the whole scenario fails. The main
reason for this failure was in fact rather generic: CDL instantons exist only
if $|V''| > H^2$ during the tunneling. Meanwhile, inflation, which, according
to [49], begins immediately after the tunneling, typically requires
$|V''| \ll H^2$. These two conditions are nearly incompatible.

Recently an attempt has been made to describe the quantum creation of an
open universe in the one-field models of chaotic inflation with the simplest
potentials of the type of $\phi^n$ without any need for the Coleman-De Luccia
bubble formation [52]. Unfortunately, all existing versions of this scenario
lead to a structureless universe with $\Omega = 10^{-2}$ [52,53]. The only
exception is the model proposed by Barvinsky, which is based on investigation
of the one-loop effective action in a theory of a scalar field with an
extremely large (and fine-tuned) nonminimal coupling to gravity [54].
However, this model, just as the original model of reference [52], is based
on the assumption that the quantum creation of the universe is described by
the Hartle-Hawking wave function [14]. Meanwhile, according to [7,15,24,53],
the Hartle-Hawking wave function describes the ground state of the universe
rather than the probability of the quantum creation of the universe. Instead
of describing the creation of the universe and its subsequent relaxation to
the minimum of the effective potential, which is the essence of inflationary
theory, it asserts that a typical universe from the very beginning is in the
ground state corresponding to the minimum of $V(\phi)$. This is the main
reason why the Hartle-Hawking wave function fails to predict a long stage of
inflation and reasonably large $\Omega$ in most of inflationary models.
Another problem of this scenario is related to the singular nature of the
Hawking-Turok instanton [53,55].

Thus, for a long time we did not have any consistent and unambiguous
realization of the one-field open universe model predicting
$0.2 < \Omega < 1$, neither of the type of [49], nor of the type of [52]. A
consistent model of one-field open inflation was proposed only very recently
[56]. The effective potential in this model has a sharp peak at large $\phi$.
During the tunneling and soon after it the condition $|V''| \ll H^2$ is
violated in this model. As a result, the spectrum of density perturbations
has a deep minimum on the horizon scale. This leads to several distinctive
features of the CMB anisotropy in this class of models [57].

Some problems of the open inflation models can be avoided if one considers
models of two scalar fields [50]. In this scenario the bubble formation
occurs due to tunneling with respect to one of the fields which has a steep
barrier in its potential. Meanwhile the role of the inflaton inside the
bubble is played by another field, rolling along a flat direction
"orthogonal" to the direction of quantum tunneling. Inflationary models of
this type have many interesting features. In these models the universe
consists of infinitely many expanding bubbles immersed into an exponentially
expanding false vacuum state. The interior of each of these bubbles looks
like an infinitely large open universe, but the values of $\Omega$ in these
universes may take any value from 1 to 0.

Many versions of these two-field models have been considered in the recent
literature; for a review see e.g. [58]. Strictly speaking, however, the
two-field models describe quasi-open universes rather than open ones. The
reason why the interior of the bubble in the one-field model can be
associated with an open universe is based on the possibility to use this
field as a clock, which is most suitable for the description of the processes
inside the bubble from the point of view of an internal observer. If one has
two fields, they are not always perfectly synchronized, which may lead to
deviations of the internal geometry from the geometry of an open universe
[50] and may even create exponentially large quasi-open regions with
different $\Omega$ within each bubble [59]. A detailed investigation of
observational consequences of such models can be found in [60].

All these models are rather complicated, and it is certainly true that the
models which lead to $\Omega = 1$ are much more abundant and natural.
Therefore flatness of the universe remains a robust prediction of most of
the inflationary models. Moreover, recent observational data indicate that
the total value of $\Omega$, including the contribution of the vacuum energy,
is close to 1 [61]. Thus, it is encouraging that inflationary theory is
versatile enough to include models with all possible values of $\Omega$, but
hopefully we will not need to use this versatility.



11 Towards inflation in supergravity: Hybrid inflation



Even though the main principles of inflationary cosmology by now are well
understood, it is not so simple to construct inflationary models in the
context of realistic theories of elementary particles, such as supergravity
and string theory. Another problem may arise if observational data indicate
that the universe is not flat but open or closed. In the subsequent sections
we will briefly describe some recent ideas which may help us to cope with
these problems.

The main problem here is that the effective potential for the inflaton field
in supergravity typically is too curved, growing as
$\exp\left(C\phi^2/M_p^2\right)$ at large values of $\phi$. The typical value
of the parameter $C$ is $O(1)$, which makes inflation impossible because the
inflaton mass in this theory becomes of the order of the Hubble constant.

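To understand the origin of this problem, let us recall the expression for
the effective potential in $N = 1$ supergravity with the minimal Kähler
potential $K = \phi\phi^*$ and with superpotential $W(\phi)$:

$$ V(\phi) = \exp\left(\frac{\phi\phi^*}{M_{Pl}^2}\right)\left[\left|\frac{\partial W}{\partial\phi} + \frac{\phi^* W}{M_{Pl}^2}\right|^2 - \frac{3|W|^2}{M_{Pl}^2}\right] . \tag{11.1} $$

Here $M_{Pl}$ is the stringy Planck mass,
$M_{Pl} = M_p/\sqrt{8\pi} \approx 2 \times 10^{18}$ GeV.

The term $\exp\left(\phi\phi^*/M_{Pl}^2\right)$ blows up at
$\phi > M_{Pl}$. The situation is not much better in string theory. The
typical shape of the scalar field potential in string theory is
$\exp\left(C\phi/M_p\right)$ with $C = O(1)$, which also prevents inflation
for $\phi > M_p$.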

In the context of supergravity this problem can be avoided by investigation
of models with non-minimal Kähler potential and/or specific superpotentials
$W$. Perhaps the easiest way to find a consistent inflationary model in
supersymmetric theories is based on the hybrid inflation scenario [62]. The
simplest version of this scenario is based on chaotic inflation in the theory
of two scalar fields with the effective potential

$$ V(\sigma,\phi) = \frac{1}{4\lambda}\left(M^2 - \lambda\sigma^2\right)^2 + \frac{m^2}{2}\phi^2 + \frac{g^2}{2}\phi^2\sigma^2 . \tag{11.2} $$

The effective mass squared of the field $\sigma$ is equal to
$-M^2 + g^2\phi^2$. Therefore for $\phi > \phi_c = M/g$ the only minimum of
the effective potential $V(\sigma,\phi)$ is at $\sigma = 0$. The curvature of
the effective potential in the $\sigma$-direction is much greater than in the
$\phi$-direction. Thus at the first stages of expansion of the universe the
field $\sigma$ rolled down to $\sigma = 0$, whereas the field $\phi$ could
remain large for a much longer time.
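
The change of sign of the $\sigma$ mass squared at $\phi_c = M/g$ can be
verified numerically. The sketch below assumes the standard hybrid-inflation
form of the potential,
$V = (M^2 - \lambda\sigma^2)^2/4\lambda + m^2\phi^2/2 + g^2\phi^2\sigma^2/2$,
and evaluates the curvature in the $\sigma$-direction at $\sigma = 0$ by a
central finite difference. The parameter values and function names are
illustrative assumptions.

```python
def hybrid_potential(sigma, phi, M=1.0, lam=1.0, m=0.1, g=1.0):
    """Hybrid-inflation potential; default parameter values are illustrative."""
    return ((M**2 - lam * sigma**2) ** 2 / (4.0 * lam)
            + 0.5 * m**2 * phi**2
            + 0.5 * g**2 * phi**2 * sigma**2)


def sigma_mass_squared(phi, h=1e-3, **params):
    """Curvature d^2V/d sigma^2 at sigma = 0, by a central finite difference."""
    v_plus = hybrid_potential(h, phi, **params)
    v_zero = hybrid_potential(0.0, phi, **params)
    v_minus = hybrid_potential(-h, phi, **params)
    return (v_plus - 2.0 * v_zero + v_minus) / h**2


if __name__ == "__main__":
    M, g = 1.0, 1.0
    phi_c = M / g  # critical value of the inflaton field
    for phi in (2.0 * phi_c, phi_c, 0.5 * phi_c):
        print(f"phi = {phi:.2f}: m_sigma^2 = {sigma_mass_squared(phi):+.4f}")
```

For $M = g = 1$ the curvature is positive at $\phi = 2\phi_c$, vanishes at
$\phi = \phi_c$ up to the discretization error, and is negative at
$\phi = \phi_c/2$, in agreement with $-M^2 + g^2\phi^2$.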

At the moment when the inflaton field $\phi$ becomes smaller than
$\phi_c = M/g$, the phase transition with the symmetry breaking occurs. If
$m^2\phi_c^2 = m^2M^2/g^2 \ll M^4/\lambda$, the Hubble constant at the time
of the phase transition is given by
$H^2 = \frac{2\pi M^4}{3\lambda M_p^2}$. If one assumes that
$M^2 \gg \frac{\lambda m^2}{g^2}$ and that $m^2 \ll H^2$, then the universe
at $\phi > \phi_c$ undergoes a stage of inflation, which abruptly ends at
$\phi = \phi_c$.

One of the advantages of this scenario is the possibility to obtain small
density perturbations even if coupling constants are large,
$\lambda, g = O(1)$. This scenario works if the effective potential has a
relatively flat $\phi$-direction. But flat directions often appear in
supersymmetric theories. This makes hybrid inflation an attractive playground
for those who want to achieve inflation in supergravity.

Another advantage of this scenario is the possibility to have inflation at
$\phi \ll M_p$. This helps to avoid problems which may appear if the
effective potential in supergravity and string theory blows up at
$\phi > M_p$.

The simplest choice of the superpotential which leads to hybrid inflation
is [63]

$$ W = S\left(\kappa\bar\phi\phi - \mu^2\right) , \tag{11.3} $$

with $\kappa \ll 1$. Here $\phi$ and $\bar\phi$ denote conjugate pairs of
superfields transforming as non-trivial representations of some gauge group
$G$ under which the superfield $S$ is neutral.

The effective potential in a globally supersymmetric theory with the
superpotential (11.3) is given by

$$ V = \frac{\kappa^2|\sigma|^2}{2}\left(|\phi|^2 + |\bar\phi|^2\right) + \left|\kappa\bar\phi\phi - \mu^2\right|^2 . \tag{11.4} $$

Here $\sigma = \sqrt{2}\,S$ is a canonically normalized scalar field. The
absolute minimum appears at $\sigma = 0$,
$\phi = \bar\phi = \mu/\sqrt{\kappa}$. However, for
$\sigma > \sigma_c = \sqrt{2}\,\mu/\sqrt{\kappa}$ the fields $\phi$ and
$\bar\phi$ possess a positive mass squared and stay at the origin. This
potential for $\phi = \bar\phi = 0$ is exactly flat in the
$\sigma$-direction. If one simply adds a mass term $m^2\sigma^2/2$ which
softly breaks supersymmetry, one obtains a simple realization of the hybrid
inflation scenario [63]: initially the scalar field $\sigma$ is large. It
slowly rolls down to $\sigma = \sigma_c$. Then the curvature of the effective
potential in the $\phi$-direction becomes negative, the fields rapidly roll
to the absolute minimum of the effective potential leading to the symmetry
breaking of the group $G$, and inflation ends, as in the original version of
the hybrid inflation scenario [62].

However, one does not need to add the term $m^2\sigma^2/2$ by hand, if one
takes into account radiative corrections and supergravity effects. The
one-loop effective potential in this model is easily calculated from the
spectrum of the theory composed by two pairs of real and pseudoscalar fields
with mass squared $\frac{\kappa^2\sigma^2}{2} \pm \kappa\mu^2$ and a Dirac
fermion with mass $\kappa\sigma/\sqrt{2}$. The one-loop effective potential
is given by [64]

$$ V_1 = \frac{\kappa^2}{128\pi^2}\left[\left(\kappa\sigma^2 - 2\mu^2\right)^2\ln\frac{\kappa\sigma^2 - 2\mu^2}{\Lambda^2} + \left(\kappa\sigma^2 + 2\mu^2\right)^2\ln\frac{\kappa\sigma^2 + 2\mu^2}{\Lambda^2} - 2\kappa^2\sigma^4\ln\frac{\kappa\sigma^2}{\Lambda^2}\right] , \tag{11.5} $$

where $\Lambda$ indicates the renormalization scale.

The supergravity potential for $\phi = \bar\phi = 0$ (ignoring the one-loop
corrections calculated above) is given by [63], in units $M_{Pl} = 1$,
$V_{sugra} = \mu^4 e^{\sigma^2/2}\left(1 - \frac{\sigma^2}{2} + \frac{\sigma^4}{4}\right) = \mu^4\left(1 + \frac{\sigma^4}{8} + \ldots\right)$.
Inflation with this potential is possible because of the cancellation of the
quadratic term $\sim \mu^4\sigma^2$, but still it may occur only at
$\sigma < 1$. Notice that the cancellation of this quadratic term derives
from the general form of the typical superpotential during hybrid inflation,
$W \sim \mu^2 S$. However, this cancellation is only operative with the
choice of minimal Kähler potential.
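
The cancellation of the quadratic term can be checked numerically. The
sketch below assumes the form
$V_{sugra}/\mu^4 = e^{\sigma^2/2}(1 - \sigma^2/2 + \sigma^4/4)$ in units
where the reduced Planck mass is 1, which is what the minimal Kähler
potential gives for a superpotential linear in $S$. It compares the exact
expression with the leading correction $\sigma^4/8$. The function names are
illustrative.

```python
import math


def v_sugra_over_mu4(sigma):
    """V_sugra / mu^4 at phi = phi_bar = 0, in units M_Pl = 1 (assumed form)."""
    x = 0.5 * sigma**2
    return math.exp(x) * (1.0 - x + x**2)


def leading_correction(sigma):
    """Leading term of V_sugra / mu^4 - 1 at small sigma."""
    return sigma**4 / 8.0


if __name__ == "__main__":
    for sigma in (0.05, 0.1, 0.3):
        exact = v_sugra_over_mu4(sigma) - 1.0
        approx = leading_correction(sigma)
        print(f"sigma = {sigma}: exact {exact:.3e}, sigma^4/8 {approx:.3e}")
```

At small $\sigma$ the deviation of the potential from $\mu^4$ scales as
$\sigma^4$ rather than $\sigma^2$, which is the flatness that makes inflation
possible in this model.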

One-loop corrections and supergravity effects make the effective potential
at $\phi = \bar\phi = 0$ not exactly flat in the $\sigma$-direction. Thus the
scalar field $\sigma$ slowly moves towards $\sigma_c$, and then inflation
abruptly ends. A detailed description of inflation in this scenario is
contained in reference [65].

Another interesting version of the hybrid inflation scenario is the so- 
called D-term inflation [66]. A thorough discussion of various versions of 
hybrid inflation in supersymmetric theories can be found in a review article 
by Lyth and Riotto [67]. 



12 Pre-Big Bang 

Thus, it is possible to obtain inflation in supergravity. The progress with re- 
alization of inflationary scenario in the context of string theory is more mod- 
est, in part because interpretation of string theory (or M-theory?) changes 
every year. But it might be possible that inflation can be implemented 
in the simplest versions of string theory, though in a rather nontrivial way. 
One of these possibilities is related to the Pre-Big-Bang (PBB) scenario [68]. 

The dynamics of the model is given by the four-dimensional effective 
action of string theory, which contains the metric, dilaton and possibly 
other matter fields. In the string frame, to the lowest order, the action can 
be written as [68] 

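$$ S = \int d^4x\,\sqrt{g}\,\frac{e^{-\sigma}}{l_s^2}\left\{\frac{R}{2} + \frac{1}{2}(\nabla\sigma)^2 + L_m(Y, g_{\mu\nu}, \sigma)\right\} , \tag{12.1} $$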

where $\sigma$ is the dimensionless dilaton field, which determines the
string coupling constant $g_s = \exp(\sigma/2)$, $g_{\mu\nu}$ is the
string-frame metric and $Y$ denotes any additional matter degrees of freedom.
The dimensional parameter $l_s$ is the string scale, and is close to today's
value of the Planck scale $l_p \sim M_p^{-1}$. The simplest variant of PBB
has been given for spatially flat FRW backgrounds, and has since been
investigated in more complicated situations. The crux of the scenario is that
any solution starting in the weak coupling regime $g_s \ll 1$ always evolves
such that the dilaton grows (i.e., the coupling increases), until it
dominates the evolution. Then the universe begins the accelerated expansion
(pre-Big-Bang inflation), and the scale factor grows superexponentially
towards a Big Bang singularity in the future. But once the theory enters the
strong coupling regime, it becomes more convenient to describe its evolution
not with respect to the string scale $l_s$, but with respect to the Planck
scale $l_p$. From that moment, the usual post-Big-Bang description takes
over [68].

Unfortunately, it is extremely difficult to match the stage of the
pre-Big-Bang inflation and the subsequent post-Big-Bang evolution [69]. It is
difficult to obtain density perturbations with a nearly flat spectrum in this
scenario. Recently it was also found that this scenario requires the universe
to be exponentially large, flat and homogeneous prior to the stage of the
pre-Big-Bang inflation [70,71]. For example, in the standard Big Bang theory
the initial size of the homogeneous part of the universe must be
approximately $10^{30}$ times greater than the Planck scale. This constitutes
the flatness problem. This problem is easily resolved in the chaotic
inflation scenario because in this scenario one should not make any
assumptions about initial properties of the universe on the scale greater
than the Planck length. Meanwhile, to solve the flatness problem (i.e. to
explain the origin of the large number $10^{30}$) in the context of the
pre-Big-Bang theory one must introduce two independent large dimensionless
parameters, $g_0^{-2} > 10^{53}$ and $B > 10^{91}$ [71]. Thus it seems very
difficult to replace the usual post-Big-Bang inflation by the pre-Big-Bang
one. And if the post-Big-Bang inflation is indeed necessary, then the
pre-Big-Bang stage will have no observational manifestations. However, this
subject is rather complicated and it is too early to make final judgements
about it. In any case, the possibility that the Big Bang is not a beginning
but a phase transition from a stringy phase is very exciting. This
possibility, originally proposed in reference [72] independently of the PBB
scenario, deserves further investigation.



13 Brane world 



One of the most interesting new trends which appeared recently in particle 
physics and cosmology is investigation of the possibility that we live on a 
4-dimensional brane in a higher-dimensional universe. This possibility goes 
back to the old paper by Rubakov and Shaposhnikov of 1983 [73], but it was
revived on a qualitatively different level after the papers by Arkani-Hamed 
et al. [74], and by Randall and Sundrum [75]. 

This is a very exciting new development, and the number of new pa- 
pers on this subject grows explosively. Some of these papers investigate 
phenomenological consequences of the new paradigm, some study possible 
changes in cosmology. We will be unable to give here even a short review 
of all of the new developments, not only because of the large amount of 
new work which has been done in this direction, but also because the whole 
subject is still rather contradictory. 

As an example we will discuss here one of the most interesting possibilities
discovered by Randall and Sundrum [75]. They have found that if two domains
of 5-dimensional anti-de Sitter space (AdS$_5$) with the same (negative)
value of vacuum energy (cosmological constant) are divided by a thin
(delta-functional) domain wall, then under certain circumstances the metric
for such a configuration can be represented as

$$ ds^2 = e^{2A(r)}\,dx^\mu dx^\nu\,\eta_{\mu\nu} + dr^2 . \tag{13.1} $$

Here $A(r) = -k|r|$, $k > 0$, and $r$ is the coordinate corresponding to the
fifth dimension. An amazing property of this solution is that because of the
exponentially rapid decrease of the factor $e^{2A(r)}$ away from the domain
wall, gravity in a certain sense becomes localized near the brane [75].
Instead of the 5d Newton law $F \sim 1/R^3$, where $R$ is the distance along
the brane, one finds the usual 4d law $F \sim 1/R^2$. Therefore, if one can
ensure confinement of other fields on the brane, one may obtain
higher-dimensional space which effectively looks like 4d space without use of
Kaluza-Klein compactification!

However, to obtain this configuration it was necessary to postulate the
existence of delta-functional domain walls with specific properties. It would
be nice to make a step from phenomenology and obtain the domain wall
configuration described in [75] as a classical solution of some
supersymmetric theory. Very soon after this goal was formulated, several
authors claimed that they indeed obtained a supersymmetric realization of the
Randall-Sundrum scenario. Then some of them withdrew their statements,
whereas many others continued producing papers containing this claim. Some of
the authors did not notice that they obtained solutions with
$A(r) = +k|r|$, growing at large $r$, which does not lead to localization of
gravity on the brane. Some others obtained the desirable regime
$A(r) = -k|r|$ because of a sign error in their equations. In some other
papers the authors used functions $W$ which they called superpotentials, but
they chose functions which cannot appear in supergravity. As a result, the
situation became rather confusing.
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The simplest supersymmetric theory in 5d space is fV = 2 gauged super- 
gravity. It can be naturally formulated in AdS^, so one would expect that 
this theory is the best candidate for implementing the Randall-Sundrum 
scenario. But until very recently it was believed that these theories pos- 
sess only one AdS^ vacuum, so there was no reason to expect existence of 
domain walls separating two different AdS^ states. It was expected that 
even if one can naively And different AdS^ vacua, in all but one of them 
one should have scalar or vector fields with a wrong sign of kinetic energy, 
which would mean that such states are unstable. 

The first important step towards resolving this problem was made by
Behrndt and Cvetic [76]. They found two different $AdS_5$ vacuum states
where scalar fields have correct kinetic terms and obtained an interpolating
domain wall solution. However, they did not check whether the kinetic terms
of vector fields are correct. More importantly, the vacuum energies
(cosmological constants) of the two $AdS_5$ solutions obtained in [76] were
different, so they were unsuitable for the Randall-Sundrum scenario. In the
beginning Behrndt and Cvetic thought that their domain wall solution had
properties similar to the Randall-Sundrum domain wall, but later they found
that this is not the case.

Soon after that, Kallosh, Shmakova and I found a model which admits a
family of different $AdS_5$ spaces with equal values of vacuum energy and with
proper signs of kinetic energy for scalar and vector fields [77]. It seemed
that this provided a proper setting for the realization of the
Randall-Sundrum scenario. Indeed, we found a domain wall configuration
separating two different $AdS_5$ spaces with equal vacuum energies. This
configuration has the metric (13.1). However, instead of $A(r) = -k|r|$,
which is necessary for localization of gravity on the wall, we found that
$A(r) \sim +k|r|$ at large $|r|$. At small $|r|$ the function $A(r)$ is
singular: it behaves as $\log |r|$ for $|r| \to 0$. The metric near the
domain wall is given by

$$ ds^2 = r^2 \eta_{\mu\nu} dx^\mu dx^\nu + dr^2 . $$   (13.2)

This implies the existence of a singularity at $r = 0$, which separates the
universe into two parts. Thus, instead of a smooth domain wall where we
could live there is a naked singularity at $r = 0$.

We were able to show that this is a general property of a broad class of
models based on $N = 2$, $d = 5$ gauged supergravity. In these models there
are no solutions of the Randall-Sundrum type [77]. A similar situation occurs
in other versions of supergravity. Despite many attempts, none of the
solutions obtained so far in various versions of supergravity leads to
localization of gravity on a brane. Perhaps there is a way to overcome this
difficulty, but at the moment it seems very difficult to make the
Randall-Sundrum scenario compatible with supersymmetry. It is possible to
find a realization of this scenario in non-supersymmetric theories for some
fine-tuned potentials, see e.g. [78], but it is not quite clear how this fine
tuning could survive radiative corrections. We hope that it is not really
necessary to make a choice between the Randall-Sundrum scenario and
supersymmetry, but so far the situation does not seem very encouraging.

Another interesting possibility is that we live not on a domain wall but on
a 4d (3 space and one time) cosmic string in a 6d world. In particular, Cohen
and Kaplan have found that gravity becomes localized near the 4d global
string [79]. This would be a viable alternative to the Randall-Sundrum
scenario. An unpleasant feature of their solution is the existence of a naked
singularity. The authors argued that this singularity may not cause any
problems if one treats it quantum mechanically.

It is well known, for example, that some solutions for the wave function
of an electron in a hydrogen atom must be excluded because they describe the
disappearance of electrons in the center of the atom at $r = 0$. Indeed, the
proton is not a hole in space, and the number of electrons in quantum
mechanics must be conserved, so solutions of that type are physically
impossible. Cohen and Kaplan argued that one should treat the naked
singularity in a similar way, excluding wave functions describing particles
disappearing in the singularity. This would render the singularity harmless.
However, it is not quite clear whether this is a legitimate approach. After
all, a singularity is a hole in space, so it does not seem possible to insist
that particles cannot disappear there.

Fortunately, according to Ruth Gregory [80], the naked singularity can be
avoided if one considers a global string in a space with a small negative
cosmological constant. However, as she pointed out, this requires
exponentially accurate fine-tuning of this negative cosmological constant,
which makes the whole scenario rather vulnerable.

It is too early to make any final conclusions about the brane new world 
of cosmology. It is an exciting and rapidly developing field, but there is still 
a large gap between attractive phenomenological models and fundamental 
physics. Hopefully this gap will be closed by further investigation. 



14 Conclusions 

During the last 20 years inflationary theory gradually became the standard
paradigm of modern cosmology. But this does not mean that all difficulties
are over and we can relax. First of all, inflation is still a scenario
which changes with every new idea in particle theory. Do we really know
that inflation began at Planck density $10^{94}$ g/cm$^3$? What if our space
has large internal dimensions, and energy density could never rise above the
electroweak density $10^{25}$ g/cm$^3$? Was there any stage before inflation?
Is it possible to implement inflation in string theory/M-theory?

We do not know which version of inflationary theory will survive ten
years from now. It is absolutely clear that new observational data are going
to rule out 99% of all inflationary models. But it does not seem likely that
they will rule out the basic idea of inflation. The inflationary scenario is
very versatile, and now, after 20 years of persistent attempts by many
physicists to propose an alternative to inflation, we still do not know any
other way to construct a consistent cosmological theory. Thus, for the time
being, we are taking the position suggested long ago by Sherlock Holmes:
“when you have eliminated the impossible, whatever remains, however
improbable, must be the truth?” [81]. Did we really eliminate the impossible?
Do we really know the truth? It is for you to find the answer.



References 

[1] A.A. Starobinsky, JETP Lett. 30 (1979) 682; Phys. Lett. 91B (1980) 99. 

[2] V.F. Mukhanov and G.V. Chibisov, JETP Lett. 33 (1981) 532. 

[3] S.W. Hawking, Phys. Lett. 115B (1982) 295; A.A. Starobinsky, ibid. 117B (1982) 
175; A.H. Guth and S.-Y. Pi, Phys. Rev. Lett. 49 (1982) 1110; J. Bardeen, P.J. 
Steinhardt and M.S. Turner, Phys. Rev. D 28 (1983) 679. 

[4] V.F. Mukhanov, JETP Lett. 41 (1985) 493; V.F. Mukhanov, H.A. Feldman and
R.H. Brandenberger, Phys. Rept. 215 (1992) 203.

[5] A.H. Guth, Phys. Rev. D 23 (1981) 347. 

[6] D.A. Kirzhnits, JETP Lett. 15 (1972) 529; D.A. Kirzhnits and A.D. Linde, Phys.
Lett. 42B (1972) 471; Sov. Phys. JETP 40 (1974) 628; Ann. Phys. 101 (1976) 195;
S. Weinberg, Phys. Rev. D 9 (1974) 3320; L. Dolan and R. Jackiw, Phys. Rev. D 9
(1974) 3357.

[7] A.D. Linde, Particle Physics and Inflationary Cosmology (Harwood, Chur, 
Switzerland, 1990). 

[8] A.H. Guth and E.J. Weinberg, Nucl. Phys. B 212 (1983) 321. 

[9] A.D. Linde, Phys. Lett. 108B (1982) 389; 114B (1982) 431; 116B (1982) 335, 340; 
A. Albrecht and P.J. Steinhardt, Phys. Rev. Lett. 48 (1982) 1220. 

[10] A.D. Linde, Phys. Lett. 129B (1983) 177. 

[11] A. Vilenkin and L.H. Ford, Phys. Rev. D 26 (1982) 1231; A.D. Linde, Phys. Lett. 
116B (1982) 335; A.A. Starobinsky Phys. Lett. 117B (1982) 175. 

[12] Ya.B. Zeldovich, Sov. Astron. Lett. 7 (1981) 322; L.P. Grishchuk and Ya.B. 
Zeldovich, in Quantum Structure of Space-Time, edited by M. Duff and C.J. Isham 
(Cambridge University Press, Cambridge, 1983) p. 353. 

[13] A. Vilenkin, Phys. Lett. B 117 (1982) 25. 

[14] J.B. Hartle and S.W. Hawking, Phys. Rev. D 28 (1983) 2960. 

[15] A.D. Linde, Sov. Phys. JETP 60 (1984) 211; Lett. Nuovo Cimento 39 (1984) 401. 

[16] Ya.B. Zeldovich and A.A. Starobinsky, Sov. Astron. Lett. 10 (1984) 135. 

[17] V.A. Rubakov, Phys. Lett. 148B (1984) 280. 







[18] A. Vilenkin, Phys. Rev. D 30 (1984) 549. 

[19] P.J. Steinhardt, in The Very Early Universe, edited by G.W. Gibbons, S.W.
Hawking and S. Siklos (Cambridge U.P., Cambridge, England, 1982) p.
251; A.D. Linde, Nonsingular Regenerating Inflationary Universe (Cambridge
University preprint, 1982); A. Vilenkin, Phys. Rev. D 27 (1983) 2848.

[20] A.D. Linde, Phys. Lett. B 175 (1986) 395; A.S. Goncharov, A. Linde and V.F. 
Mukhanov, Int. J. Mod. Phys. A 2 (1987) 561. 

[21] A. Vilenkin, Phys. Rev. D 27 (1983) 2848. 

[22] A. A. Starobinsky, in Fundamental Interactions (in Russian, MGPI, Moscow, 1984) 
p. 55. 

[23] A. A. Starobinsky, in Current Trends in Field Theory, Quantum Gravity, and 
Strings, Lecture Notes in Physics, edited by H.J. de Vega and N. Sanches (Springer- 
Verlag, Heidelberg, 1986). 

[24] A.D. Linde, D.A. Linde and A. Mezhlumian, Phys. Rev. D 49 (1994) 1783. 

[25] A.D. Dolgov and A.D. Linde, Phys. Lett. 116B (1982) 329; L.F. Abbott, E. Farhi
and M. Wise, Phys. Lett. 117B (1982) 29.

[26] L.A. Kofman, A.D. Linde and A.A. Starobinsky, Phys. Rev. Lett. 73 (1994) 3195;
L. Kofman, A. Linde and A.A. Starobinsky, Phys. Rev. D 56 (1997) 3258-3295.

[27] P.B. Greene, L. Kofman, A.D. Linde and A. A. Starobinsky, Phys. Rev. D 56 (1997) 
6175-6192 [hep-ph/9705347]. 

[28] S. Khlebnikov and I.I. Tkachev, Phys. Rev. Lett. 77 (1996) 219; Phys. Lett. B
390 (1997) 80; Phys. Rev. Lett. 79 (1997) 1607; Phys. Rev. D 56 (1997) 653;
T. Prokopec and T.G. Roos, Phys. Rev. D 55 (1997) 3768; B.R. Greene, T. Prokopec
and T.G. Roos, Phys. Rev. D 56 (1997) 6484.

[29] L.A. Kofman, A.D. Linde and A.A. Starobinsky, Phys. Rev. Lett. 76 (1996) 1011;
I. Tkachev, Phys. Lett. B 376 (1996) 35.

[30] I. Tkachev, L.A. Kofman, A.D. Linde, S. Khlebnikov and A. A. Starobinsky 
(in preparation). 

[31] E.W. Kolb, A. Linde and A. Riotto, Phys. Rev. Lett. 77 (1996) 4960. 

[32] G.W. Anderson, A.D. Linde and A. Riotto, Phys. Rev. Lett. 77 (1996) 3716. 

[33] J. García-Bellido, D. Grigorev, A. Kusenko and M. Shaposhnikov [hep-ph/9902449];
J. García-Bellido and A. Linde, Phys. Rev. D 57 (1998) 6075.

[34] G. Felder, L.A. Kofman and A.D. Linde [hep-ph/9812289]. 

[35] L.H. Ford, Phys. Rev. D 35 (1987) 2955; B. Spokoiny, Phys. Lett. B 315 (1993) 40;
M. Joyce, Phys. Rev. D 55 (1997) 1875; M. Joyce and T. Prokopec, Phys. Rev. D
57 (1998) 6022; P.J.E. Peebles and A. Vilenkin [astro-ph/9810509].

[36] D.A. Kirzhnits, JETP Lett. 15 (1972) 529; D.A. Kirzhnits and A.D. Linde, Phys. 
Lett. 42B (1972) 471. 

[37] S. Weinberg, Phys. Rev. D 9 (1974) 3320; L. Dolan and R. Jackiw, Phys. Rev. D 9 
(1974) 3357; D.A. Kirzhnits and A.D. Linde, Sov. Phys. JETP 40 (1974) 628. 

[38] D.A. Kirzhnits and A.D. Linde, Ann. Phys. 101 (1976) 195. 

[39] D.A. Kirzhnits and A.D. Linde, Ann. Phys. (N.Y.) 101 (1976) 195. 

[40] A.H. Guth, Phys. Rev. D 23 (1981) 347; A.D. Linde, Phys. Lett. 108B (1982) 389; 
A. Albrecht and P.J. Steinhardt, Phys. Rev. Lett. 48 (1982) 1220. 

[41] V.A. Kuzmin, V.A. Rubakov and M.E. Shaposhnikov, Phys. Lett. 155B (1985) 36; 
M.E. Shaposhnikov, JETP Lett. 44 (1986) 465; Nucl. Phys. B 287 (1987) 757; Nucl. 
Phys. B 299 (1988) 797. 

[42] L. Kofman, A. Linde and A. A. Starobinsky, Phys. Rev. Lett. 76 (1996) 1011. 

[43] I.I. Tkachev, Phys. Lett. B 376 (1996) 35.







[44] S. Khlebnikov, L. Kofman, A. Linde and I. Tkachev, Phys. Rev. Lett. 81 (1998) 
2012 [hep-ph/9804425]. 

[45] I. Tkachev, S. Khlebnikov, L. Kofman and A. Linde, Phys. Lett. B 440 (1998) 
262-268 [hep-ph/9805209]. 

[46] G. Felder, L. Kofman, A. Linde and I. Tkachev, Inflation after Preheating 
(in preparation). 

[47] S. Coleman and F. De Luccia, Phys. Rev. D 21 (1980) 3305. 

[48] J.R. Gott, Nat 295 (1982) 304; J.R. Gott and T.S. Statler, Phys. Lett. 136B (1984) 
157. 

[49] M. Bucher, A.S. Goldhaber and N. Turok, Phys. Rev. D 52 (1995) 3314; K. 
Yamamoto, M. Sasaki and T. Tanaka, ApJ 455 (1995) 412. 

[50] A.D. Linde, Phys. Lett. B 351 (1995) 99; A.D. Linde and A. Mezhlumian, Phys. 
Rev. D 52 (1995) 6789. 

[51] S.W. Hawking and I.G. Moss, Phys. Lett. 110B (1982) 35.

[52] S.W. Hawking and N. Turok, Phys. Lett. B 425 (1998) 25; S.W. Hawking and N.
Turok, Phys. Lett. B 432 (1998) 271 [hep-th/9803156].

[53] A.D. Linde, Phys. Rev. D 58 (1998) 083514 [gr-qc/9802038]. 

[54] A.O. Barvinsky [hep-th/9806093, gr-qc/9812058]. 

[55] A. Vilenkin [hep-th/9803084]; W. Unruh [gr-qc/9803050]; R. Bousso and A.D. Linde 
[gr-qc/9803068]; J. Garriga [hep-th/9803210, hep-th/9804106]; R. Bousso and A. 
Chamblin [hep-th/9805167]. 

[56] A.D. Linde, Phys. Rev. D 59 (1999) 023503. 

[57] A.D. Linde, M. Sasaki and T. Tanaka [astro-ph/9901135]. 

[58] J. García-Bellido [hep-ph/9803270].

[59] J. García-Bellido, J. Garriga and X. Montes, Phys. Rev. D 57 (1998) 4669.

[60] J. García-Bellido, J. Garriga and X. Montes [hep-ph/9812533].

[61] P.D. Mauskopf et al. [astro-ph/9911444]; A. Melchiorri et al. [astro-ph/9911445].

[62] A.D. Linde, Phys. Lett. B 259 (1991) 38; Phys. Rev. D 49 (1994) 748. 

[63] E.J. Copeland, A.R. Liddle, D.H. Lyth, E.D. Stewart and D. Wands, Phys. Rev. D 
49 (1994) 6410. 

[64] G. Dvali, Q. Shafi and R.K. Schaefer, Phys. Rev. Lett. 73 (1994) 1886. 

[65] A.D. Linde and A. Riotto, Phys. Rev. D 56 (1997) 1841. 

[66] P. Binetruy and G. Dvali, Phys. Lett. B 388 (1996) 241; E. Halyo, Phys. Lett. B 
387 (1996) 43. 

[67] D.H. Lyth and A. Riotto [hep-ph/9807278] . 

[68] G. Veneziano, Phys. Lett. B 265 (1991) 287; M. Gasperini and G. Veneziano, 
Astropart. Phys. 1 (1993) 317. 

[69] R. Brustein and G. Veneziano, Phys. Lett. B 329 (1994) 429; N. Kaloper, R. Madden 
and K.A. Olive, Nucl. Phys. B 452 (1995) 677; E.J. Copeland, A. Lahiri and D. 
Wands, Phys. Rev. D 50 (1994) 4868; N. Kaloper, R. Madden and K.A. Olive, Phys. 
Lett. B 371 (1996) 34; R. Easther, K. Maeda and D. Wands, Phys. Rev. D 53 (1996) 
4247. 

[70] M.S. Turner and E. Weinberg, Phys. Rev. D 56 (1997) 4604. 

[71] N. Kaloper, A. Linde and R. Bousso [hep-th/9801073]. 

[72] R. Brandenberger and C. Vafa, Nucl. Phys. B 316 (1989) 391. 

[73] V. Rubakov and M. Shaposhnikov, Phys. Lett. B 125 (1983) 136. 

[74] N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B 429 (1998) 263 
[hep-ph/9807344]. 

[75] L. Randall and R. Sundrum, Phys. Rev. Lett. 83 (1999) 4690 [hep-th/9906064]. 







[76] K. Behrndt and M. Cvetic [hep-th/9909058]. 

[77] R. Kallosh, A. Linde and M. Shmakova, JHEP 11 (1999) 010 [hep-th/9910021]; R. 
Kallosh and A. Linde [hep-th/0001071]. 

[78] A. Chamblin and G.W. Gibbons [hep-th/9909130]; O. DeWolfe, D.Z. Freedman, S.S. 
Gubser and A. Karch [hep-th/9909134]; M. Gremm [hep-th/9912060] . 

[79] A.G. Cohen and D.B. Kaplan (1999) [hep-th/9910132].

[80] R. Gregory (1999) [hep-th/9911015].

[81] A. Conan Doyle, The Sign of Four, Chapter 6,
http://www.litrix.com/sign4/signo006.htm




COURSE 8 

COSMOLOGICAL CONSTANT VS. QUINTESSENCE 

P. BINETRUY 



LPT, Universite Paris-Sud, France 





Contents 



1 Cosmological constant

2 The role of supersymmetry

3 Observational results

4 Quintessence

4.1 Runaway quintessence

4.2 Pseudo-Goldstone boson

5 Quintessential problems

6 Extra spacetime dimensions

7 Conclusion







Abstract 

There is some evidence that the Universe is presently undergoing ac- 
celerating expansion. This has restored some credit to the scenarios 
with a non-vanishing cosmological constant. From the point of view 
of a theory of fundamental interactions, one may argue that a dy- 
namical component with negative pressure is easier to achieve. As 
an illustration, the quintessence scenario is described and its short- 
comings are discussed in connection with the nagging “cosmological 
constant problem”. 



1 Cosmological constant 



As is well known, the cosmological constant appears as a constant in the
Einstein equations:

$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = -8\pi G_N T_{\mu\nu} + \lambda g_{\mu\nu} , $$   (1)

where $G_N$ is Newton's constant and $T_{\mu\nu}$ is the energy-momentum tensor.
The cosmological constant $\lambda$ is thus of the dimension of an inverse
length squared. It was introduced by Einstein [1,2] in order to build a static
universe model, its repulsive effect compensating the gravitational
attraction, but, as we will now see, constraints on the expansion of the
Universe impose a very small upper bound on it.

It is more convenient to work in the specific context of a Friedmann
universe, with a Robertson-Walker metric:

$$ ds^2 = dt^2 - a^2(t) \left[ \frac{dr^2}{1 - k r^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right] , $$   (2)



where $a(t)$ is the cosmic scale factor. Implementing energy conservation into
the Einstein equations then leads to the Friedmann equation, which gives an
expression for the Hubble parameter $H$:

$$ H^2 \equiv \frac{\dot a^2}{a^2} = \frac{1}{3} \left( \lambda + 8\pi G_N \rho \right) - \frac{k}{a^2} , $$   (3)

where, using standard notations, $\dot a$ is the time derivative of the cosmic
scale factor, $\rho = T_{00}$ is the energy density and the term proportional
to $k$ is a spatial curvature term (see (2)). Note that the cosmological
constant appears as a constant contribution to the Hubble parameter.

© EDP Sciences, Springer-Verlag 2000

Evaluating each term of the Friedmann equation at the present time allows
for a rapid estimate of an upper limit on $\lambda$. Indeed, we have
$H_0 = h_0 \times 100$ km s$^{-1}$ Mpc$^{-1}$ with $h_0$ of order one, whereas
the present energy density $\rho_0$ is certainly within one order of magnitude
of the critical energy density
$\rho_c = 3 H_0^2 / (8\pi G_N) = h_0^2 \, 2 \times 10^{-26}$ kg m$^{-3}$;
moreover the spatial curvature term certainly does not presently represent a
dominant contribution to the expansion of the Universe. Thus, (3) implies the
following constraint on $\lambda$:

$$ |\lambda| \le H_0^2 . $$   (4)

In other words, the length scale $\ell_\lambda \equiv |\lambda|^{-1/2}$
associated with the cosmological constant must be larger than
$\ell_{H_0} \equiv c/H_0 = h_0^{-1} \times 10^{26}$ m, and thus a macroscopic
distance.

This is not a problem as long as one remains classical. Indeed,
$\ell_{H_0} \equiv c/H_0$ provides a natural macroscopic scale for our present
Universe. The problem arises when one tries to combine gravity with the
quantum theory. Indeed, from Newton's constant and the Planck constant
$\hbar$ one can construct a mass scale or a length scale

$$ m_P = \sqrt{\frac{\hbar c}{8\pi G_N}} = 2.4 \times 10^{18} \ \mathrm{GeV}/c^2 , $$

$$ \ell_P = \frac{\hbar}{m_P c} = 8.1 \times 10^{-35} \ \mathrm{m} . $$

The above constraint now reads:

$$ \ell_\lambda \equiv |\lambda|^{-1/2} \ge \ell_{H_0} = \frac{c}{H_0} \sim 10^{60} \, \ell_P . $$   (5)

In other words, there are more than sixty orders of magnitude between the
scale associated with the cosmological constant and the scale of quantum
gravity.
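The orders of magnitude quoted above can be checked with a few lines of Python. This is only a rough numerical sketch: the function names are illustrative, the constants are rounded SI values supplied here (they are not given in the text), and $h_0 = 1$ is assumed by default.

```python
import math

# Rounded SI constants (assumed values, not taken from the text)
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m / s
G_N = 6.67430e-11        # Newton's constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22     # metres per megaparsec


def hubble_length(h0: float = 1.0) -> float:
    """Length c / H0 in metres, for H0 = h0 * 100 km/s/Mpc."""
    hubble_rate = h0 * 100e3 / MPC_IN_M  # H0 in s^-1
    return C / hubble_rate


def reduced_planck_length() -> float:
    """Length hbar / (m_P c) with m_P = sqrt(hbar c / (8 pi G_N))."""
    m_p = math.sqrt(HBAR * C / (8.0 * math.pi * G_N))  # in kg
    return HBAR / (m_p * C)


if __name__ == "__main__":
    ratio = hubble_length() / reduced_planck_length()
    print(f"c/H0   = {hubble_length():.2e} m")
    print(f"ell_P  = {reduced_planck_length():.2e} m")
    print(f"log10(ratio) = {math.log10(ratio):.1f}")
```

The ratio depends only weakly on $h_0$, so the sixty orders of magnitude do not hinge on the precise value of the Hubble constant.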

A rather obvious solution is to take $\lambda = 0$. This is as valid a choice
as any other in a pure gravity theory. Unfortunately, it is an unnatural
one when one introduces any kind of matter. Indeed, set $\lambda$ to zero but
assume that there is a non-vanishing vacuum (i.e. groundstate) energy:
$\langle T_{\mu\nu} \rangle = -\langle\rho\rangle g_{\mu\nu}$; then the
Einstein equations (1) read

$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = -8\pi G_N T_{\mu\nu} + 8\pi G_N \langle\rho\rangle g_{\mu\nu} . $$   (6)

The last term is interpreted as an effective cosmological constant:

$$ \lambda_{\rm eff} = 8\pi G_N \langle\rho\rangle \equiv \frac{\Lambda^4}{m_P^2} . $$   (7)

Generically $\langle\rho\rangle$ receives a non-zero contribution from
symmetry breaking: for instance, the scale $\Lambda$ would be typically of the
order of 100 GeV in the case of the gauge symmetry breaking of the Standard
Model or 1 TeV in the case of supersymmetry breaking. But the constraint (5)
now reads:

$$ \Lambda \le 10^{-30} \, m_P \sim 10^{-3} \ \mathrm{eV} . $$   (8)

It is this very unnatural fine-tuning of parameters (in explicit cases
$\langle\rho\rangle$ and thus $\Lambda$ are functions of the parameters of the
theory) that is referred to as the cosmological constant problem, or more
accurately the vacuum energy problem.
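The bound on the vacuum energy scale can be estimated numerically. The sketch below assumes the vacuum energy density is parametrized as $\langle\rho\rangle \sim \Lambda^4$, so that the effective cosmological constant is $\Lambda^4/m_P^2$ and the constraint $|\lambda| \le H_0^2$ becomes $\Lambda \le \sqrt{H_0 m_P}$ in units $\hbar = c = 1$. The numerical constants and the function name are assumptions introduced for illustration.

```python
import math

# Rounded constants (assumed values, not taken from the text)
HBAR_EV_S = 6.582119569e-16   # reduced Planck constant, eV s
MPC_IN_M = 3.0857e22          # metres per megaparsec
M_P_EV = 2.435e27             # reduced Planck mass, eV (about 2.4e18 GeV)


def vacuum_scale_bound_ev(h0: float = 1.0) -> float:
    """Upper bound on Lambda in eV from Lambda^4 / m_P^2 <= H0^2."""
    hubble_rate = h0 * 100e3 / MPC_IN_M      # H0 in s^-1
    hubble_energy = HBAR_EV_S * hubble_rate  # hbar * H0 in eV
    return math.sqrt(hubble_energy * M_P_EV)


if __name__ == "__main__":
    bound = vacuum_scale_bound_ev()
    print(f"Lambda <~ {bound:.1e} eV = {bound / M_P_EV:.1e} m_P")
```

Compared with a symmetry-breaking scale of 100 GeV (that is, $10^{11}$ eV), the bound is smaller by roughly fourteen orders of magnitude in $\Lambda$, hence far more in $\Lambda^4$.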

2 The role of supersymmetry 

If the vacuum energy is to be small, it may be easier to have it vanish
through some symmetry argument. Global supersymmetry is the obvious
candidate. Indeed, the supersymmetry algebra

$$ \{ Q_r , \bar Q_s \} = 2 \gamma^\mu_{rs} P_\mu $$   (9)

yields the following relation between the Hamiltonian $H = P_0$ and the
supersymmetry generators $Q_r$:

$$ H = \frac{1}{4} \sum_r Q_r^2 , $$   (10)

and thus the vacuum energy $\langle 0|H|0 \rangle$ vanishes if the vacuum is
supersymmetric ($Q_r |0\rangle = 0$).

Unfortunately, supersymmetry has to be broken at some scale since its
prediction of equal mass for bosons and fermions is not observed in the
physical spectrum. Then the vacuum energy scale $\Lambda$ is of the order of
the supersymmetry breaking scale, that is a few hundred GeV to a TeV.

However, the right framework to discuss these issues is supergravity, i.e.
local supersymmetry, since locality implies here, through the algebra (9),
invariance under “local” translations that are the diffeomorphisms of general
relativity. In this theory, the graviton, described by the linear
perturbations of the metric tensor $g_{\mu\nu}(x)$, is associated through
supersymmetry with a spin 3/2 field, the gravitino $\psi_\mu$. One may write a
supersymmetric invariant combination of terms in the action:

$$ S = \int d^4x \sqrt{g} \left[ 3 m_P^2 m_{3/2}^2 - m_{3/2} \bar\psi_\mu \sigma^{\mu\nu} \psi_\nu \right] , $$   (11)

where $\sigma^{\mu\nu} = [\gamma^\mu , \gamma^\nu]/2$. If the first term is
made to cancel the vacuum energy, then the second term is interpreted as a
mass term for the gravitino. We thus see that the criterion for spontaneous
symmetry breaking changes from global supersymmetry (non-vanishing vacuum
energy) to local supersymmetry or supergravity (non-vanishing gravitino mass).
It is somewhat welcome news that a vanishing vacuum energy is not tied to a
supersymmetric spectrum. On the other hand, we have lost the only rationale
that we had to explain a zero cosmological constant.

Let us recall for future use that, in supergravity, the potential for a
set of scalar fields $\phi^i$ is written in terms of the Kähler potential
$K(\phi^i, \bar\phi^{\bar\imath})$ (the normalisation of the scalar field
kinetic terms is simply given by the Kähler metric
$K_{i\bar\jmath} = \partial^2 K / \partial\phi^i \partial\bar\phi^{\bar\jmath}$)
and of the superpotential $W(\phi^i)$, a holomorphic function of the fields:

$$ V = e^{K/m_P^2} \left[ \left( W_i + \frac{K_i}{m_P^2} W \right) K^{i\bar\jmath} \left( \bar W_{\bar\jmath} + \frac{K_{\bar\jmath}}{m_P^2} \bar W \right) - 3 \frac{|W|^2}{m_P^2} \right] + \mathrm{D\ terms} , $$   (12)

where $K_i = \partial K / \partial\phi^i$, etc. and $K^{i\bar\jmath}$ is the
inverse metric of $K_{i\bar\jmath}$. Obviously, the positive definiteness of
the global supersymmetry scalar potential is lost in supergravity.



3 Observational results 

Over the last years, there has been an increasing number of indications that
the Universe is presently undergoing accelerated expansion. This appears
to be a strong departure from the standard picture of a matter-dominated
Universe. Indeed, the standard equation for the conservation of energy,

$$ \dot\rho = -3 (p + \rho) H , $$   (13)

allows one to derive from the Friedmann equation (3), written in the case of a
universe dominated by a component with energy density $\rho$ and pressure $p$:

$$ \frac{\ddot a}{a} = -\frac{4\pi G_N}{3} (\rho + 3p) . $$   (14)

Obviously, a matter-dominated ($p \sim 0$) universe is decelerating. One needs
instead a component with a negative pressure.

A cosmological constant is associated with a contribution to the
energy-momentum tensor as in (6, 7):

$$ T^\mu{}_\nu = -\frac{\lambda}{8\pi G_N} \delta^\mu_\nu = (-\rho, p, p, p) . $$   (15)

The associated equation of state is therefore

$$ p = -\rho . $$   (16)

It follows from (14) that a cosmological constant tends to accelerate
expansion.

The discussion of data is thus often expressed in terms of the energy
density $\rho_\Lambda$ stored in the vacuum versus the energy density $\rho_M$
in matter fields (baryons, neutrinos, hidden matter, ...). It is customary to
normalize with the critical density (corresponding to a flat Universe):

$$ \Omega_\Lambda = \frac{\rho_\Lambda}{\rho_c} , \qquad \Omega_M = \frac{\rho_M}{\rho_c} , \qquad \rho_c = \frac{3 H_0^2}{8\pi G_N} . $$   (17)

The relation

$$ \Omega_M + \Omega_\Lambda = 1 , $$   (18)

a prediction of many inflation scenarios, is found to be compatible with
recent Cosmic Microwave Background measurements [4]^1. It is striking that
independent methods based on the measurement of different observables
on rich clusters of galaxies all point towards a low value of
$\Omega_M \sim 1/3$ [6]: mass-to-light ratio, baryon mass to total cluster mass
ratio (the total baryon density in the Universe being fixed by primordial
nucleosynthesis), cluster abundance. Together with (18), this necessarily
implies a non-vanishing $\Omega_\Lambda$ (non-vanishing cosmological constant
or a similar dynamical component).

There are indeed some indications going in this direction from several
types of observational data. One which has been much discussed lately uses
supernovae of type Ia as standard candles^2. Two groups, the Supernova
Cosmology Project [7] and the High-z Supernova Search [8], have found
that distant supernovae appear to be fainter than expected in a flat
matter-dominated Universe. If this is to have a cosmological origin, this
means that, at fixed redshift, they are at larger distances than expected in
such a context and thus that the Universe is accelerating.

More precisely, the relation between the flux $f$ received on earth and the
absolute luminosity $\mathcal{L}$ of the supernova depends on its redshift
$z$, but also on the geometry of spacetime. Traditionally, flux and absolute
luminosity are expressed on a log scale as apparent magnitude $m_B$ and
absolute magnitude $M$ (magnitude is $-2.5 \log_{10}$ luminosity + constant).
The relation then reads

$$ m_B = 5 \log (H_0 d_L) + M - 5 \log H_0 + 25 . $$   (19)

The last terms are $z$-independent, if one assumes that supernovae of type
Ia are standard candles; they are then measured by using low-$z$ supernovae.
The first term, which involves the luminosity distance $d_L$, varies
logarithmically with $z$ up to corrections which depend on the geometry.
Expanding in $z$ one obtains [10]^3:

$$ H_0 d_L = c z \left[ 1 + \frac{1 - q_0}{2} z + \cdots \right] , $$   (20)

where $q_0 \equiv -\ddot a a / \dot a^2$ is the deceleration parameter. This
parameter is easily obtained from (14): in a spatially flat Universe with only
matter and a cosmological constant (cf. (16)),
$\rho = \rho_M + \rho_\Lambda$ and $p = -\rho_\Lambda$, which gives

$$ q_0 = \Omega_M / 2 - \Omega_\Lambda . $$   (21)

This allows one to put some limit on $\Omega_\Lambda$ in the model considered
here (see Fig. 1).

Let us note that the combination (21) is “orthogonal” to the combination
$\Omega_T \equiv \Omega_M + \Omega_\Lambda$ measured in CMB experiments (see
footnote 1). The two measurements are therefore complementary: this is
sometimes referred to as “cosmic complementarity”.

^1 This follows from the fact that the first acoustic peak is expected at an
“angular” scale $\ell \sim 200 / \sqrt{\Omega_M + \Omega_\Lambda}$ [5].

^2 By calibrating them according to the timescale of their brightening and
fading.
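To illustrate how (20) and (21) relate to an exact distance-redshift relation, here is a small Python sketch. It assumes the standard flat-universe expression $H_0 d_L / c = (1+z) \int_0^z dz' / \sqrt{\Omega_M (1+z')^3 + \Omega_\Lambda}$, which is not written out in the text above, and evaluates it with Simpson's rule. All function names are illustrative.

```python
import math


def exact_distance(z: float, omega_m: float, omega_lambda: float,
                   steps: int = 2000) -> float:
    """H0 d_L / c in a spatially flat universe (omega_m + omega_lambda = 1).

    Uses composite Simpson integration; `steps` must be even.
    """
    def inverse_e(zp: float) -> float:
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_lambda)

    h = z / steps
    total = inverse_e(0.0) + inverse_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inverse_e(i * h)
    return (1.0 + z) * total * h / 3.0


def expanded_distance(z: float, omega_m: float, omega_lambda: float) -> float:
    """Second-order expansion (20) with q0 taken from (21)."""
    q0 = omega_m / 2.0 - omega_lambda
    return z * (1.0 + 0.5 * (1.0 - q0) * z)


def magnitude_offset(z: float, omega_m: float, omega_lambda: float) -> float:
    """Apparent magnitude relative to a flat matter-only universe at the same z."""
    ratio = exact_distance(z, omega_m, omega_lambda) / exact_distance(z, 1.0, 0.0)
    return 5.0 * math.log10(ratio)


if __name__ == "__main__":
    for z in (0.01, 0.1, 0.5):
        print(z, exact_distance(z, 0.3, 0.7), expanded_distance(z, 0.3, 0.7))
    print("offset at z = 0.5:", magnitude_offset(0.5, 0.3, 0.7), "mag")
```

A positive offset means the supernova appears fainter than in the flat matter-dominated case, which is the qualitative effect described above. The values $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$ are only an example.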

Of course, this type of measurement is sensitive to many possible
systematic effects (evolution besides the light-curve timescale correction,
etc.), and this has fueled a healthy debate on the significance of present
data. This obviously requires more statistics and improved quality of
spectral measurements. A particularly tricky systematic effect is the
possible presence of dust that would dim supernovae at large redshift.

Other results come from gravitational lensing. The deviation of light
rays by an accumulation of matter along the line of sight depends on the
distance to the source [10]

and thus on the cosmological parameters $\Omega_M$ and $\Omega_\Lambda$. As
$q_0$ decreases (i.e. as the Universe accelerates), there is more volume and
more lenses between the observer and the object at redshift $z$. Several
methods are used: abundance of multiply-imaged quasar sources [11], strong
lensing by massive clusters of galaxies (providing multiple images or arcs)
[12], weak lensing [13].



4 Quintessence 

From the point of view of high energy physics, it is however difficult to
imagine a rationale for a pure cosmological constant, especially if it is
nonzero but small compared to the typical fundamental scales (electroweak,
strong, grand unified or Planck scale). There should be physics associated
with this form of energy and therefore dynamics. For example, in the context
of string models, any dimensionful parameter is expressed in terms of the
fundamental string scale $M_S$ and vacuum expectation values of scalar fields.
The physics of the cosmological constant is then the physics of the
corresponding scalar fields.

Fig. 1. Best-fit confidence regions in the $\Omega_M$-$\Omega_\Lambda$ plane,
based on the analysis of 42 type Ia supernovae discovered by the Supernova
Cosmology Project [7].

^3 Of course, since supernovae of redshift $z \sim 1$ are now being observed,
an exact expression [9] must be used to analyze data. The more transparent
form of (20) gives the general trend.

Introducing dynamics generally modifies the equation of state (16) to
the more general form with negative pressure:

$$ p = w \rho , \qquad w < 0 . $$   (23)

Let us recall that $w = 0$ corresponds to non-relativistic matter (dust)
whereas $w = 1/3$ corresponds to radiation. A network of light,
nonintercommuting topological defects [14,15] on the other hand gives
$w = -n/3$ where $n$ is the dimension of the defect, i.e. 1 for a string and 2
for a domain wall. Finally, the equation of state for a minimally coupled
scalar field necessarily satisfies the condition $w \ge -1$.

Experimental data may constrain such a dynamical component just as they
did with the cosmological constant. For example, in a spatially flat Universe
with only matter and an unknown component $X$ with equation of state
$p_X = w_X \rho_X$, one obtains from (14) with $\rho = \rho_M + \rho_X$,
$p = w_X \rho_X$ the following form for the deceleration parameter

$$ q_0 = \frac{\Omega_M}{2} + (1 + 3 w_X) \frac{\Omega_X}{2} , $$   (24)

where $\Omega_X = \rho_X / \rho_c$. Supernovae results give a constraint on
the parameter $w_X$ as shown in Figure 2. Similarly, gravitational lensing
effects are sensitive to this new component through (22).
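The deceleration parameter for an arbitrary mix of components follows directly from (14), since each component with density fraction $\Omega_i$ and equation of state $w_i$ contributes $\Omega_i (1 + 3 w_i)/2$. The short Python sketch below encodes this; the function names and the sample values of $\Omega$ and $w$ are illustrative assumptions, not fits to data.

```python
import math


def deceleration_parameter(components) -> float:
    """q0 = sum_i Omega_i (1 + 3 w_i) / 2 for (Omega_i, w_i) pairs."""
    return 0.5 * sum(omega * (1.0 + 3.0 * w) for omega, w in components)


def is_accelerating(components) -> bool:
    """The expansion accelerates when q0 is negative."""
    return deceleration_parameter(components) < 0.0


if __name__ == "__main__":
    matter_only = [(1.0, 0.0)]
    with_lambda = [(0.3, 0.0), (0.7, -1.0)]       # reproduces (21)
    with_strings = [(0.3, 0.0), (0.7, -1.0 / 3)]  # w = -1/3: no acceleration
    with_walls = [(0.3, 0.0), (0.7, -2.0 / 3)]    # w = -2/3
    for name, mix in [("matter", matter_only), ("lambda", with_lambda),
                      ("strings", with_strings), ("walls", with_walls)]:
        print(name, deceleration_parameter(mix), is_accelerating(mix))
```

A component with $w = -1/3$ contributes nothing to $q_0$, which is why a pressure more negative than $-\rho/3$ is needed to drive acceleration.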

A particularly interesting candidate in the context of fundamental
theories is the case of a scalar^4 field $\phi$ slowly evolving in a runaway
potential which decreases monotonically to zero as $\phi$ goes to infinity
[16-18]. This is often referred to as quintessence. This can be extended to
the case of a very light field (pseudo-Goldstone boson) which is presently
relaxing to its vacuum state [19]. We will discuss the two situations in turn.

4.1 Runaway quintessence 

A runaway potential is frequently present in models where supersymmetry 
is dynamically broken. Indeed, supersymmetric theories are characterized 
by a scalar potential with many fiat directions, i.e. directions (j) in field 
space for which the potential vanishes. The corresponding degeneracy is 
lifted through dynamical supersymmetry breaking, that is supersymmetry 
breaking through strong interaction effects. In some instances (dilaton or 
compactification radius), the field expectation value {(j>) actually provides 
the value of the strong interaction coupling. Then at infinite (/> value, the 
coupling effectively goes to zero together with the supersymmetry breaking 



vector field or any field which is not a Lorentz scalar must have settled down to a 
vanshing value. Otherwise, Lorentz invariance would be spontaneously broken. 
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Fig. 2. Best fit coincidence regions in the Hm — wx plane for an additional en- 
ergy density component fix with equation of state wx = pxjpx, assuming fiat 
cosmology (flM + fix = 1); based on the same analysis [7] as in Figure 1. 



effects and the flat direction is restored: the potential decreases monotonically
to zero as $\phi$ goes to infinity.

Dynamical supersymmetry breaking scenarios are often advocated because
they easily yield the large scale hierarchies necessary in grand unified
or superstring theories in order to explain the smallness of the electroweak
scale with respect to the fundamental scale. Let us take the example of
supersymmetry breaking by gaugino condensation in effective superstring
theories. The value $g_0$ of the gauge coupling at the string scale $M_S$ is
provided by the vacuum expectation value of the dilaton field $s$ (taken to be
dimensionless by dividing by $m_P$) present among the massless string modes:
$g_0^2 = \langle s \rangle^{-1}$. If the gauge group has a one-loop beta function coefficient $b$,
then the running gauge coupling becomes strong at the scale

$$\Lambda \sim M_S\, e^{-1/2bg_0^2} = M_S\, e^{-s/2b}. \tag{25}$$

At this scale, the gaugino fields are expected to condense. Through dimensional
analysis, the gaugino condensate $\langle\lambda\lambda\rangle$ is expected to be of order $\Lambda^3$.
Terms quadratic in the gaugino fields thus yield, in the effective theory below
the condensation scale, a potential for the dilaton:

$$V \sim |\langle\lambda\lambda\rangle|^2 \propto e^{-3s/b}. \tag{26}$$



The $s$-dependence of the potential is of course more complicated and one
usually looks for stable minima with vanishing cosmological constant. But
the behavior (26) is characteristic of the large $s$ region and provides a potential
sloping down to zero at infinity, as required in the quintessence solution.
A similar behavior is observed for moduli fields whose vev describes the radius
of the compact manifolds which appear from the compactification from
10 or 11 dimensions to 4 in superstring theories [20].

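Let us take therefore the example of an exponentially decreasing potential.
More explicitly, we consider the following action

$$S = \int d^4x \sqrt{g} \left[ -\frac{m_P^2}{2} R + \frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - V(\phi) \right], \tag{27}$$

which describes a real scalar field $\phi$ minimally coupled with gravity and the
self-interactions of which are described by the potential:

$$V(\phi) = V_0\, e^{-\lambda\phi/m_P}, \tag{28}$$

where $V_0$ is a positive constant.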

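The energy density and pressure stored in the scalar field are respectively:

$$\rho_\phi = \frac{1}{2}\dot\phi^2 + V(\phi), \qquad p_\phi = \frac{1}{2}\dot\phi^2 - V(\phi). \tag{29}$$

We will assume that the background (matter and radiation) energy density
$\rho_B$ and pressure $p_B$ obey a standard equation of state

$$p_B = w_B \rho_B. \tag{30}$$

If one neglects the spatial curvature ($k \sim 0$), the equation of motion for $\phi$
simply reads

$$\ddot\phi + 3H\dot\phi = -\frac{dV}{d\phi}, \tag{31}$$

with

$$H^2 = \frac{1}{3m_P^2}\,(\rho_B + \rho_\phi). \tag{32}$$

This can be rewritten as

$$\dot\rho_\phi = -3H\dot\phi^2. \tag{33}$$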

We are looking for scaling solutions, i.e. solutions where the $\phi$ energy density
scales as a power of the cosmic scale factor: $\rho_\phi \propto a^{-n_\phi}$ or $\dot\rho_\phi/\rho_\phi = -n_\phi H$.
In this case, one easily obtains from (29) and (33) that the $\phi$ field obeys a
standard equation of state

$$p_\phi = w_\phi \rho_\phi, \tag{34}$$

with

$$w_\phi = \frac{n_\phi}{3} - 1. \tag{35}$$

Hence

$$\rho_\phi \propto a^{-3(1+w_\phi)}. \tag{36}$$



If one can neglect the background energy $\rho_B$, then (32) yields a simple
differential equation for $a(t)$ which is solved as:

$$a \propto t^{2/[3(1+w_\phi)]}. \tag{37}$$

Since $\dot\phi^2 = (1+w_\phi)\rho_\phi$, one deduces that $\phi$ varies logarithmically with time.
One then easily obtains from (31, 32) that

$$\phi = \phi_0 + \frac{2m_P}{\lambda}\ln(t/t_0) \tag{38}$$

and$^5$

$$w_\phi = \frac{\lambda^2}{3} - 1. \tag{39}$$

It is clear from (39) that, for $\lambda$ sufficiently small, the field $\phi$ can play the
role of quintessence.

But the successes of the standard big-bang scenario indicate that clearly
$\rho_\phi$ cannot have always dominated: it must have emerged from the background
energy density $\rho_B$. Let us thus now consider the case where $\rho_B$
dominates. It turns out that the solution just discussed with $\rho_\phi \gg \rho_B$ and
(39) is a late time attractor [21] only if $\lambda^2 < 3(1+w_B)$. If $\lambda^2 > 3(1+w_B)$,
the global attractor is a scaling solution [16,22,23] with the following
properties$^6$:

$$\Omega_\phi \equiv \frac{\rho_\phi}{\rho_\phi + \rho_B} = \frac{3}{\lambda^2}(1+w_B), \tag{40}$$

$$w_\phi = w_B. \tag{41}$$

The second equation (41) clearly indicates that this does not correspond to
a quintessence solution (23).
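The two regimes of the exponential potential can be summarised in a few lines of code. This is only a sketch of the statements (39)-(41); the function name is an assumption, and the boundary case $\lambda^2 = 3(1+w_B)$, where both branches coincide, is assigned to the field-dominated branch by convention.

```python
import math


def exponential_attractor(lam, w_b):
    """Late-time attractor for V = V0 * exp(-lam * phi / m_P) in a background w_b.

    Returns (omega_phi, w_phi) following (39)-(41).
    """
    lam2 = lam * lam
    if lam2 > 3.0 * (1.0 + w_b):
        # Scaling solution: fixed fraction (40), field mimics the background (41).
        return 3.0 * (1.0 + w_b) / lam2, w_b
    # Field-dominated solution with equation of state (39).
    return 1.0, lam2 / 3.0 - 1.0


if __name__ == "__main__":
    for lam in (1.0, 2.0, math.sqrt(20.0), 6.0):
        omega, w = exponential_attractor(lam, 1.0 / 3.0)  # radiation era
        print(f"lambda = {lam:.2f}:  Omega_phi = {omega:.3f}, w_phi = {w:+.3f}")
```

In the radiation era ($w_B = 1/3$) the fraction (40) is $4/\lambda^2$, so $\lambda^2 = 20$ corresponds to $\Omega_\phi = 0.2$.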

The semi-realistic models discussed earlier tend to give large values of $\lambda$
and thus the latter scaling solution as an attractor. For example, in the case
(26) where the scalar field is the dilaton, $\lambda = 3/b$ with $b = C/(16\pi^2)$ and
$C = 90$ for an $E_8$ gauge symmetry down to $C = 9$ for $SU(3)$. Moreover [23],
on the observational side, the condition that $\rho_\phi$ should be subdominant
during nucleosynthesis (in the radiation-dominated era) requires
rather large values of $\lambda$. Typically, requiring $\rho_\phi/(\rho_\phi + \rho_B)$ to be then smaller
than 0.2 imposes $\lambda^2 > 20$.

$^5$ Under the condition $\lambda^2 < 6$ ($w_\phi < 1$ since $V(\phi) > 0$).

$^6$ See reference [24] for the case where the scalar field is non-minimally coupled to
gravity.

Although not quintessence, such attractor models with a fixed fraction
$\Omega_\phi$ as in (40) have interest of their own [23], in particular for structure
formation if $\lambda \in [5,6]$. It has been proposed recently [25] to make the
prefactor $V_0$ in (28) a trigonometric function in $\phi$. This allows for some
modulation around the previous attractor in an approximately oscillatory
way: $\Omega_\phi$ could then have been small at the time of nucleosynthesis and
be much larger at present times. Finally, very recently [26], such models
have been coupled to a system of a Brans-Dicke field and a dynamical field
characterizing the cosmological constant, with a diverging kinetic term, to
provide a relaxation mechanism for the cosmological constant [27].

Ways to obtain a quintessence component have been proposed however. 
Let me sketch some of them in turn. 

One is the notion of tracker field$^7$ [28]. This idea also rests on the
existence of scaling solutions of the equations of motion which play the role
of late time attractors, as illustrated above. An example is provided by a
scalar field described by the action (27) with a potential

$$V(\phi) = \lambda\,\frac{\Lambda^{4+\alpha}}{\phi^\alpha} \tag{42}$$

with $\alpha > 0$. In the case where the background density dominates, one
finds an attractor scaling solution [17,24,29,30] $\phi \propto a^{3(1+w_B)/(2+\alpha)}$,
$\rho_\phi \propto a^{-3\alpha(1+w_B)/(2+\alpha)}$. Thus $\rho_\phi$ decreases at a slower rate than the background
density ($\rho_B \propto a^{-3(1+w_B)}$) and tracks it until it becomes of the same order
at a given value $a_Q$. More precisely [20]:

$$\phi = m_P \sqrt{\frac{\alpha(2+\alpha)}{3(1+w_B)}} \left(\frac{a}{a_Q}\right)^{3(1+w_B)/(2+\alpha)}, \tag{43}$$

$$\rho_\phi \sim \lambda\,\frac{\Lambda^{4+\alpha}}{m_P^\alpha} \left(\frac{a}{a_Q}\right)^{-3\alpha(1+w_B)/(2+\alpha)}. \tag{44}$$

One finds

$$w_\phi = -1 + \frac{\alpha(1+w_B)}{2+\alpha}. \tag{45}$$

$^7$ Somewhat of a misnomer since in this solution, as we see below, the field $\phi$ energy
density tracks the radiation-matter energy density before overcoming it (in contradistinction
with (40)). One should rather describe it as a transient tracker field.








Shortly after $\phi$ has reached for $a = a_Q$ a value of order $m_P$, it satisfies the
standard slow-roll conditions:

$$m_P\,|V'/V| \ll 1, \tag{46}$$

$$m_P^2\,|V''/V| \ll 1, \tag{47}$$

and therefore (45) provides a good approximation to the present value of
$w_\phi$. Thus, at the end of the matter-dominated era, this field may provide
the quintessence component that we are looking for.

Two features are particularly interesting in this respect. One is that this
scaling solution is reached for rather general initial conditions, i.e. whether
$\rho_\phi$ starts of the same order or much smaller than the background energy
density [28]. The second deals with the central question in this game: why is
the $\phi$ energy density (or in the case of a cosmological constant, the vacuum
energy density) emerging now? Since $\phi$ is of order $m_P$ in this scenario, it
can be rephrased here into the following: why is $V(m_P)$ of the order of the
critical energy density $\rho_c$? Using (44), this amounts to a constraint on the
parameters of the theory:

$$\Lambda \sim \left(H_0^2\, m_P^{2+\alpha}\right)^{1/(4+\alpha)}. \tag{48}$$

For example, this gives for $\alpha = 2$, $\Lambda \sim 10$ MeV, not such an unnatural
value.
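The order of magnitude quoted for $\alpha = 2$ can be checked with a short sketch of (48). The numerical inputs (a Hubble constant of about $1.5 \times 10^{-33}$ eV and a reduced Planck mass of about $2.4 \times 10^{27}$ eV) and the function name are assumptions chosen for illustration; the coupling $\lambda$ in (42) is taken to be of order one.

```python
import math

# Assumed round values, not taken from the text:
H0_EV = 1.5e-33   # Hubble constant for h ~ 0.7, in eV
M_P_EV = 2.4e27   # reduced Planck mass, in eV


def tracker_scale_ev(alpha, h0=H0_EV, m_p=M_P_EV):
    """Scale Lambda (in eV) from (48): Lambda^(4+alpha) ~ H0^2 * m_P^(2+alpha).

    Logarithms are used because m_P^(2+alpha) overflows a float for large alpha.
    """
    log10_lambda = (2.0 * math.log10(h0) + (2.0 + alpha) * math.log10(m_p)) / (4.0 + alpha)
    return 10.0 ** log10_lambda


if __name__ == "__main__":
    for alpha in (1, 2, 4, 6):
        print(f"alpha = {alpha}:  Lambda ~ {tracker_scale_ev(alpha):.1e} eV")
```

For $\alpha = 2$ the result falls in the 10 MeV range, and in the limit $\alpha \to 0$ it reduces to $\sqrt{H_0 m_P}$, the meV scale associated with a cosmological constant.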

Let us note here the key difference between this tracking scenario and
the preceding one$^8$. Whereas the exponential potential model accounts for
a fixed fraction $\Omega_\phi$ in the attractor solution (and thus $\phi$ is a tracker in the
strict sense), the final attractor in the tracker field solution corresponds to
scalar field dominance ($\Omega_\phi \sim 1$). It is the scale $\Lambda$ which regulates
the time at which the scalar field starts to emerge and makes it coincide
with present time. The welcome property is that the required value for $\Lambda$
falls in a reasonable range from a high energy physics point of view. On the
other hand, we will see below that the fact that the present value for $\phi$ is
of order $m_P$ is a source of problems.

Models of dynamical supersymmetry breaking easily provide a model
of the type just discussed [20]. Let us consider supersymmetric QCD with
gauge group $SU(N_c)$ and $N_f < N_c$ flavors, i.e. $N_f$ quarks $Q^i$ (resp. antiquarks
$\bar Q_i$), $i = 1 \cdots N_f$, in the fundamental $N_c$ (resp. anti-fundamental
$\bar N_c$) of $SU(N_c)$. At the scale of dynamical symmetry breaking $\Lambda$ where



$^8$ I wish to thank M. Joyce for discussions on this point.
the gauge coupling becomes strong$^9$, boundstates of the meson type form:
$\Pi^i_j = Q^i \bar Q_j$. The dynamics is described by a superpotential which can be
computed non-perturbatively using standard methods:

$$W = (N_c - N_f)\,\frac{\Lambda^{(3N_c - N_f)/(N_c - N_f)}}{(\det \Pi)^{1/(N_c - N_f)}}. \tag{49}$$



Such a superpotential has been used in the past but with the addition of a
mass or interaction term (i.e. a positive power of $\Pi$) in order to stabilize the
condensate. One does not wish to do that here if $\Pi$ is to be interpreted as a
runaway quintessence component. For illustration purposes, let us consider
a condensate diagonal in flavor space: $\Pi^i_j \equiv \phi^2 \delta^i_j$ (see [31] for a more
complete analysis). Then the potential for $\phi$ has the form (42), with
$\alpha = 2(N_c + N_f)/(N_c - N_f)$. Thus,

$$w_\phi = -1 + \frac{N_c + N_f}{2N_c}\,(1 + w_B), \tag{50}$$

which clearly indicates that the meson condensate is a potential candidate
for a quintessence component.
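The step from (45) to (50) is a one-line substitution, which the following sketch verifies with exact rational arithmetic. The function names are illustrative assumptions; only the formulas for $\alpha$, (45) and (50) come from the text.

```python
from fractions import Fraction


def sqcd_alpha(n_c, n_f):
    """Inverse-power exponent alpha = 2 (N_c + N_f) / (N_c - N_f) for N_f < N_c."""
    if not 0 < n_f < n_c:
        raise ValueError("requires 0 < N_f < N_c")
    return Fraction(2 * (n_c + n_f), n_c - n_f)


def tracker_w(alpha, w_b):
    """Equation of state (45) of the tracker solution."""
    return -1 + alpha * (1 + w_b) / (2 + alpha)


def sqcd_w(n_c, n_f, w_b):
    """Equation of state (50) for the meson condensate."""
    return -1 + Fraction(n_c + n_f, 2 * n_c) * (1 + w_b)


if __name__ == "__main__":
    for n_c, n_f in ((2, 1), (3, 1), (3, 2), (5, 1)):
        alpha = sqcd_alpha(n_c, n_f)
        print(n_c, n_f, alpha, tracker_w(alpha, Fraction(0)), sqcd_w(n_c, n_f, Fraction(0)))
```

Since $1/2 < (N_c + N_f)/(2N_c) < 1$, the value of $w_\phi$ in a matter background ($w_B = 0$) lies between $-1/2$ and $0$; for $N_c = 3$, $N_f = 1$ one finds $\alpha = 4$ and $w_\phi = -1/3$.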

Another possibility for the emergence of the quintessence component
out of the background energy density might be attributed to the presence
of a local minimum (a "bump") in the potential $V(\phi)$: when the field $\phi$
approaches it, it slows down and $\rho_\phi$ decreases more slowly ($n_\phi$ being much
smaller as $w_\phi$ temporarily becomes closer to $-1$, cf. (35)). If the parameters
of the potential are chosen carefully enough, this allows the background
energy density, which scales as $a^{-3(1+w_B)}$, to become subdominant. The
value of $\phi$ at the local minimum provides the scale which regulates
the time at which this happens. This approach can be traced back to the
earlier work of Wetterich [16] and has recently been advocated by Albrecht
and Skordis [32] in the context of an exponential potential. They argue
quite sensibly that, in a "realistic" string model, $V_0$ in (28) is $\phi$-dependent:
$V_0(\phi)$. This new field dependence might be such as to generate a bump in
the scalar potential and thus a local minimum. Since

$$\frac{1}{V}\frac{dV}{d\phi} = \frac{V_0'(\phi)}{V_0(\phi)} - \frac{\lambda}{m_P}, \tag{51}$$

it suffices that $m_P V_0'(\phi)/V_0(\phi)$ becomes temporarily larger than $\lambda$ in order
to slow down the redshift of $\rho_\phi$: once $\rho_\phi$ dominates, an attractor scaling
solution of the type (38, 39) is within reach, if $\lambda$ is not too large.



As pointed out by Albrecht and Skordis, the success of this scheme does
not require very small couplings.

$^9$ It is given by an expression such as (25) where $g_0$ is the value of the gauge coupling at
the large scale $M_S$ and $b$ the one-loop beta function coefficient for gauge group $SU(N_c)$.

One may note that, in the preceding model, one could arrange the local
minimum in such a way as to completely stop the scalar field, allowing
for a period of true inflation [33]. The last possibility that I will discuss
goes in this direction one step further. It is known under several names:
deflation [34], kination [35,36], quintessential inflation [37]. It is based on
the remark that, if a field $\phi$ is to provide a dynamical cosmological constant
under the form of quintessence, it is a good candidate to account for an
inflationary era where the evolution is dominated by the vacuum energy.
In other words, are the quintessence component and the inflaton one and
the same field?

In this kind of scenario, inflation (where the energy density of the
Universe is dominated by the $\phi$ field potential energy) is followed by
reheating where matter-radiation is created by gravitational coupling during
an era where the evolution is driven by the $\phi$ field kinetic energy (which
decreases as $a^{-6}$). Since matter-radiation energy density is decreasing more
slowly, this turns into a radiation-dominated era until the $\phi$ energy density
eventually emerges as in the quintessence scenarios described above.
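The transition from the kinetic era to radiation domination follows from the two scaling laws alone. The sketch below assumes, for illustration only, that radiation redshifts as $a^{-4}$ with no further energy transfer after its creation, and that the scale factor is normalised to $a = 1$ at the start of the kinetic era; the function name and sample densities are hypothetical.

```python
import math


def radiation_takeover(rho_kin0, rho_rad0):
    """Scale factor at which radiation (a^-4) catches up with kinetic energy (a^-6).

    Both densities are given at a = 1, the assumed start of the kinetic era.
    """
    if rho_kin0 <= 0.0 or rho_rad0 <= 0.0:
        raise ValueError("densities must be positive")
    # rho_kin0 * a^-6 = rho_rad0 * a^-4  =>  a^2 = rho_kin0 / rho_rad0
    return math.sqrt(rho_kin0 / rho_rad0)


if __name__ == "__main__":
    a_eq = radiation_takeover(1.0, 1e-4)
    print(f"radiation dominates for a > {a_eq:.1f}")
```

A small initial radiation fraction therefore only delays radiation domination by the square root of the density ratio in scale factor.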

Finally, it is worth mentioning that, even though the models discussed
above all have $w_\phi > -1$, models with lower values of $w_\phi$ may easily be
constructed. One may cite models with non-normalized scalar field kinetic
terms [38], or simply models with non-minimally coupled scalar fields [39].
Indeed, it has been argued by Caldwell [40] that such a "phantom" energy
density component fits well the present observational data.

4.2 Pseudo-Goldstone boson 

There exists a class of models [19] very close in spirit to the case of runaway
quintessence: they correspond to a situation where a scalar field has not yet
reached its stable groundstate and is still evolving in its potential.

More specifically, let us consider a potential of the form:

$$V(\phi) = M^4\, v\!\left(\frac{\phi}{f}\right), \tag{52}$$

where $M$ is the overall scale, $f$ is the vacuum expectation value $\langle\phi\rangle$ and
the function $v$ is expected to have coefficients of order one. If we want the
potential energy of the field (assumed to be close to its vev $f$) to give a
substantial fraction of the energy density at present time, we must set

$$M^4 \sim \rho_c \sim H_0^2 m_P^2. \tag{53}$$

However, requiring that the evolution of the field $\phi$ around its minimum has
been overdamped by the expansion of the Universe until recently imposes

$$m_\phi^2 = \frac{1}{2} V''(f) \sim \frac{M^4}{f^2} \leq H_0^2. \tag{54}$$

Let us note that this is one of the slow-roll conditions familiar from inflation
scenarios.

From (53) and (54), we conclude that $f$ is of order $m_P$ (as the value of the
field $\phi$ in runaway quintessence) and that $M \sim 10^{-3}$ eV (not surprisingly,
this is the scale $\Lambda$ typical of the cosmological constant, see (8)). The field
$\phi$ must be very light: $m_\phi \sim h_0 \times 10^{-60}\, m_P \sim h_0 \times 10^{-33}$ eV. Such a small
value is only natural in the context of an approximate symmetry: the field
$\phi$ is then a pseudo-Goldstone boson.

A typical example of such a field is provided by the axion field (QCD
axion [41] or string axion [42]). In this case, the potential simply reads:

$$V(\phi) = M^4\left[1 + \cos(\phi/f)\right]. \tag{55}$$
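The scales quoted above follow directly from (53) and (54). The sketch below evaluates them for assumed round inputs ($H_0 \approx 1.5 \times 10^{-33}$ eV, reduced Planck mass $\approx 2.4 \times 10^{27}$ eV); the function names are illustrative, and order-one factors are ignored throughout.

```python
import math

# Assumed round values, not taken from the text:
H0_EV = 1.5e-33   # Hubble constant for h ~ 0.7, in eV
M_P_EV = 2.4e27   # reduced Planck mass, in eV


def pgb_scales(f, h0=H0_EV, m_p=M_P_EV):
    """Return (M, m_phi) in eV for a decay constant f in eV.

    M follows from (53), M^4 ~ H0^2 m_P^2; m_phi uses the estimate
    m_phi^2 ~ M^4 / f^2 appearing in (54).
    """
    m_overall = math.sqrt(h0 * m_p)
    m_phi = m_overall ** 2 / f
    return m_overall, m_phi


def axion_potential(phi, m_overall, f):
    """Potential (55): V = M^4 [1 + cos(phi / f)]."""
    return m_overall ** 4 * (1.0 + math.cos(phi / f))


if __name__ == "__main__":
    m_overall, m_phi = pgb_scales(M_P_EV)
    print(f"M ~ {m_overall:.1e} eV, m_phi ~ {m_phi:.1e} eV")
```

With $f = m_P$ the mass estimate saturates the bound (54), $m_\phi \sim H_0$, and $M$ comes out at a few times $10^{-3}$ eV.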

5 Quintessential problems 

However appealing, the quintessence idea is difficult to implement in the
context of realistic models [43,44]. The main problem lies in the fact that
the quintessence field must be extremely weakly coupled to ordinary matter.
This problem can take several forms:

• we have assumed until now that the quintessence potential monotonically
decreases to zero at infinity. In realistic cases, this is difficult to
achieve because the couplings of the field to ordinary matter generate
higher order corrections that are increasing with larger field values,
unless forbidden by a symmetry argument. For example, in the case
of the potential (42), the generation of a correction term $\lambda_d\, m_P^{4-d} \phi^d$
puts in jeopardy the slow-roll constraints on the quintessence field, unless
very stringent constraints are imposed on the coupling $\lambda_d$. But
one typically expects from supersymmetry breaking $\lambda_d \sim M_S^4/m_P^4$
where $M_S$ is the supersymmetry breaking scale [44].

Similarly, because the vev of $\phi$ is of order $m_P$, one must take into
account the full supergravity corrections. One may then argue [45]
that this could put in jeopardy the positive definiteness of the scalar
potential, a key property of the quintessence potential. This may point
towards models where $\langle W \rangle = 0$ (but not its derivatives, see (12)) or to
no-scale type models: in the latter case, the presence of 3 moduli fields
$T^i$ with Kähler potential $K = -\sum_i \ln(T^i + \bar T^i)$ cancels the negative
contribution $-3|W|^2$ in (12)$^{10}$.

$^{10}$ Moreover, supergravity corrections may modify some of the results. For example, the
• the quintessence field must be very light [43]. If we return to our
example of supersymmetric QCD in (42), $V''(m_P)$ provides an order
of magnitude for the mass-squared of the quintessence component:

$$m_\phi \sim \Lambda \left(\frac{\Lambda}{m_P}\right)^{1+\alpha/2} \sim H_0 \sim 10^{-33}\ \mathrm{eV}, \tag{56}$$

using (48). Similarly, we have seen that the mass of a pseudo-Goldstone
boson that could play the role of quintessence is typically
of the same order. This field must therefore be very weakly coupled to
matter; otherwise its exchange would generate observable long range
forces. Eötvös-type experiments put very severe constraints on such
couplings.

Again, for the case of supersymmetric QCD, higher order corrections
to the Kähler potential of the type

$$\kappa(\phi_i, \phi_j^\dagger)\;\cdots \tag{57}$$

will generate couplings of order 1 to the standard matter fields $\phi_i$, $\phi_j$
since $\langle Q \rangle$ is of order $m_P$. In order to alleviate this problem, Masiero
et al. [31] have proposed a solution much in the spirit of the least coupling
principle of Damour and Polyakov [46]: the different functions
$\beta_{ij}$ have a common minimum close to the value $\langle Q \rangle$, which is most easily
obtained by assuming "flavor" independence of the functions $\beta_{ij}$.
This obviously eases the Eötvös experiment constraints. In the early
stages of the evolution of the Universe, when $Q \ll m_P$, couplings of
the type (57) generate a contribution to the mass of the $Q$ field which,
being proportional to $H$, does not spoil the tracker solution.



• It is difficult to find a symmetry that would prevent any coupling
of the form $\beta(\phi/m_P) F^{\mu\nu}F_{\mu\nu}$ to the gauge field kinetic term. Since
the quintessence behavior is associated with time-dependent values
of the field of order $m_P$, this would generate, in the absence of fine
tuning, corrections of order one to the gauge coupling. But the time
dependence of the fine structure constant for example is very strongly
constrained [47]: $|\dot\alpha/\alpha| < 5 \times 10^{-17}\ \mathrm{yr}^{-1}$. This yields a limit [43]:

$$|\beta| \leq 10^{-6}\,\frac{m_P H_0}{\langle\dot\phi\rangle}, \tag{58}$$



presence of a (flat) Kähler potential in (12) induces exponential field-dependent factors.
A more adequate form for the inverse power law potential (42) is thus [45]
$V(\phi) = \lambda\, e^{\phi^2/2m_P^2}\, \Lambda^{4+\alpha}/\phi^\alpha$. The exponential factor is not expected to change much the late
time evolution of the quintessence energy density. Brax and Martin [45] argue that it
changes the equation of state through the value of $w_\phi$.
where $\langle\dot\phi\rangle$ is the average over the last $2 \times 10^9$ years.

A possible solution is to implement an approximate continuous symmetry
of the type: $\phi \to \phi + $ constant [43]. This symmetry must be
approximate since it must allow for a potential $V(\phi)$. Such a symmetry
would only allow derivative couplings, an example of which is an
axion-type coupling $\beta(\phi/m_P)F^{\mu\nu}\tilde F_{\mu\nu}$. If $F^{\mu\nu}$ is the color $SU(3)$ field
strength, QCD instantons yield a mass of order $\beta\Lambda^2_{QCD}/m_P$, much too
large to satisfy the preceding constraint. In any case, supersymmetry
would relate such a coupling to the coupling $\beta(\phi/m_P)F^{\mu\nu}F_{\mu\nu}$ that we
started out to forbid.
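The factor $10^{-6}$ in (58) is essentially the ratio of the bound on $|\dot\alpha/\alpha|$ to the Hubble rate. The sketch below makes this explicit under two assumptions that are not spelled out in the text: that the coupling induces $|\dot\alpha/\alpha| \sim |\beta|\,\langle\dot\phi\rangle/m_P$ up to order-one factors, and that $H_0 = 70$ km/s/Mpc. The function and constant names are illustrative.

```python
SECONDS_PER_YEAR = 3.156e7
KM_PER_MPC = 3.0857e19


def hubble_rate_per_year(h0_km_s_mpc=70.0):
    """Convert H0 from km/s/Mpc to yr^-1 (70 km/s/Mpc is an assumed value)."""
    return h0_km_s_mpc / KM_PER_MPC * SECONDS_PER_YEAR


def beta_bound(alpha_drift_max=5e-17, phi_dot_over_mp_h0=1.0, h0_km_s_mpc=70.0):
    """Order-of-magnitude bound on |beta|.

    Assumes |alpha_dot / alpha| ~ |beta| * <phi_dot> / m_P, where
    phi_dot_over_mp_h0 = <phi_dot> / (m_P H0) is of order one for quintessence.
    """
    return alpha_drift_max / (phi_dot_over_mp_h0 * hubble_rate_per_year(h0_km_s_mpc))


if __name__ == "__main__":
    print(f"H0 ~ {hubble_rate_per_year():.2e} per year")
    print(f"|beta| < {beta_bound():.1e} for <phi_dot> ~ m_P H0")
```

For $\langle\dot\phi\rangle \sim m_P H_0$ the bound comes out just below $10^{-6}$, in line with (58).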

The very light mass of the quintessence component points towards scalar-tensor
theories of gravity, where such a dilaton-type (Brans-Dicke) scalar
field is found. This has triggered some recent interest for this type of
theories. Attractor scaling solutions have been found for non-minimally
coupled fields [24,48]. However, as discussed above, one problem is that
scalar-tensor theories lead to time-varying constants of nature. One may
either put some limit on the couplings of the scalar field [49] or use the
attractor mechanism towards General Relativity that was found by Damour
and Nordtvedt [46,50]. This mechanism exploits the stabilisation of the
dilaton-type scalar through its conformal coupling to matter. Indeed,
assuming that this scalar field $\phi$ couples to matter through an action term
$S_m(\psi_m, f(\phi)\,g_{\mu\nu})$, then its equation of motion takes the form:

$$\frac{2}{3 - \phi'^2}\,\phi'' + (1 - w_B)\,\phi' = -(1 - 3w_B)\,\frac{d \ln f(\phi)}{d\phi}, \tag{59}$$

where $\phi' = d\phi/d\ln a$. This equation can be interpreted [50] as the motion
of a particle of velocity-dependent mass $2/(3 - \phi'^2)$ subject to a damping
force $(1 - w_B)\phi'$ in an external force deriving from a potential
$V_{\rm eff}(\phi) = (1 - 3w_B) \ln f(\phi)$. If this effective potential has a minimum, the field quickly
settles there. Bartolo and Pietroni [51] have recently proposed a model
of quintessence (they add a potential $V(\phi)$) using this mechanism: the
quintessence component is first attracted to General Relativity and then to
a standard tracker solution.
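The particle analogy for (59) can be made concrete by integrating the equation numerically. The sketch below uses a fourth-order Runge-Kutta step in $\ln a$ and assumes, purely for illustration, a quadratic coupling function $\ln f(\phi) = \kappa\phi^2/2$ with a minimum at $\phi = 0$; this choice, the parameter values and the function names are not taken from the text.

```python
def rhs(phi, dphi, w_b, kappa):
    """phi'' from (59), assuming ln f(phi) = kappa * phi^2 / 2."""
    force = -(1.0 - 3.0 * w_b) * kappa * phi   # -(1 - 3 w_B) d ln f / d phi
    damping = -(1.0 - w_b) * dphi              # -(1 - w_B) phi'
    return 0.5 * (3.0 - dphi * dphi) * (force + damping)


def evolve(phi0, dphi0, w_b, kappa, n_efolds, steps):
    """Integrate (59) over n_efolds e-folds with classical RK4; returns (phi, phi')."""
    h = n_efolds / steps
    phi, dphi = phi0, dphi0
    for _ in range(steps):
        k1p, k1v = dphi, rhs(phi, dphi, w_b, kappa)
        k2p = dphi + 0.5 * h * k1v
        k2v = rhs(phi + 0.5 * h * k1p, k2p, w_b, kappa)
        k3p = dphi + 0.5 * h * k2v
        k3v = rhs(phi + 0.5 * h * k2p, k3p, w_b, kappa)
        k4p = dphi + h * k3v
        k4v = rhs(phi + h * k3p, k4p, w_b, kappa)
        phi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        dphi += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return phi, dphi


if __name__ == "__main__":
    print("matter era:   ", evolve(1.0, 0.0, 0.0, 1.0, 10.0, 1000))
    print("radiation era:", evolve(1.0, 0.0, 1.0 / 3.0, 1.0, 10.0, 1000))
```

In a matter background the field is driven towards the minimum of $\ln f$ within a few e-folds, whereas in a radiation background ($w_B = 1/3$) the driving force vanishes and a field initially at rest stays where it is.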

Scalar-tensor theories of gravity naturally arise in the context of higher- 
dimensional theories and we will return to such scenarios in the next section 
where we discuss these theories. 

All the preceding shows that there is extreme fine tuning in the couplings
of the quintessence field to matter, unless they are forbidden by some
symmetry. This is somewhat reminiscent of the fine tuning associated with the
cosmological constant. In fact, the quintessence solution does not claim to
solve the cosmological constant (vacuum energy) problem described above.
If we take the example of a supersymmetric theory, the dynamical cosmological
constant provided by the quintessence component clearly does not
provide a sufficient amount of supersymmetry breaking to account for the mass
difference between scalars (sfermions) and fermions (quarks and leptons):
at least 100 GeV. There must be other sources of supersymmetry breaking
and one must fine tune the parameters of the theory in order not to generate
a vacuum energy that would completely drown $\rho_\phi$.

In any case, the quintessence solution shows that, once this fundamental
problem is solved, one can find explicit fundamental models that effectively
provide the small amount of cosmological constant that seems required by
experimental data.

6 Extra spacetime dimensions 

The old idea of Kaluza and Klein about compact extra dimensions has 
received a new twist with the realisation, motivated by string theory [52], 
that such extra dimensions may only be felt by gravitational interactions 
[53]. In other words, our 4-dimensional world of quarks, leptons and gauge 
interactions may constitute a hypersurface in a higher-dimensional Universe. 
Such a hypersurface is called a brane in modern jargon: certain types of 
branes (Dirichlet branes) appear as solitons in open string theories [54]. 
In what follows, we will mainly consider 4-dimensional branes to which 
are confined observable matter as well as standard non-gravitational gauge 
interactions. The part of the Universe which is not confined to the brane is 
called the bulk (which for simplicity we will take to be 5-dimensional). 

In this framework, the very notion of a cosmological constant takes a 
new meaning and there has been recently a lot of activity to try to unravel 
it. The hope is that the cosmological constant problem itself may receive a 
different formulation, easier to deal with. 

If we think of the cosmological constant as some vacuum energy, one has 
the choice to add it to the brane or to the bulk. The consequences are quite 
different: 

• If we introduce a vacuum energy $\lambda_b > 0$ on the brane, it creates a
repulsive gravitational force outside (i.e. in the bulk). Indeed, a result
originally obtained by Ipser and Sikivie [55] in the case of a domain
wall may be adapted here as follows: let $p$ and $\rho$ be the pressure
and energy density on the brane, then if $2\rho + 3p$ is positive (resp.
negative), a test body may remain in the bulk stationary with respect to the brane
if it accelerates away from (resp. towards) the brane. In the case of a
positive cosmological constant, $\rho = -p = \lambda_b$ and $2\rho + 3p = -\lambda_b < 0$.

Projected back to our 4-dimensional brane-world, this yields a different
behaviour from the one seen in a standard 4-dimensional world.
For example, the vacuum energy contributes to the Hubble parameter
describing the expansion of the brane world in a (non-standard)
quadratic way [56]: $H^2 = \lambda_b^2/(36M^6) + \cdots$, where $M$ is the fundamental
5-dimensional scale.

• If we introduce a vacuum energy $\Lambda_B$ in the 5-dimensional bulk (this
$\Lambda_B$ is then of mass dimension 5), this will induce a potential for the
modulus field whose vev measures the radius of the compact dimension.
Let us call for simplicity $R$ this modulus, which is often referred
to as the radion. Then in the case of a single compact dimension,
$V(R) = \Lambda_B R$ [57].

The contribution of this bulk vacuum energy to the square of the
Hubble parameter on the brane is standard (linear): $H^2 = \Lambda_B/(6M^3) + \cdots$



Including both types of vacuum energy makes it possible to construct static
solutions with a cancelling effect in the bulk. Indeed, if one imposes the condition:

$$\Lambda_B = -\frac{\lambda_b^2}{6M^3}, \tag{60}$$

the effective 4-dimensional cosmological constant, i.e. the constant term in
the Hubble parameter $H$, vanishes.
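The cancellation can be checked by adding the quadratic brane contribution and the linear bulk contribution to $H^2$ quoted above. The sketch assumes that these two constant terms simply add, which is the reading suggested by the text; the function names and the sample numbers (in arbitrary units) are illustrative.

```python
def hubble_sq_constant(lambda_brane, lambda_bulk, m5):
    """Constant term of H^2: linear bulk term plus quadratic brane term.

    lambda_brane: brane vacuum energy (mass dimension 4)
    lambda_bulk:  bulk vacuum energy (mass dimension 5)
    m5:           fundamental 5-dimensional scale M
    """
    return lambda_bulk / (6.0 * m5 ** 3) + lambda_brane ** 2 / (36.0 * m5 ** 6)


def tuned_bulk_energy(lambda_brane, m5):
    """Bulk vacuum energy satisfying the condition (60)."""
    return -lambda_brane ** 2 / (6.0 * m5 ** 3)


if __name__ == "__main__":
    lam_b, m5 = 2.0, 1.5
    lam_bulk = tuned_bulk_energy(lam_b, m5)
    print("tuned bulk energy:", lam_bulk)
    print("constant term of H^2:", hubble_sq_constant(lam_b, lam_bulk, m5))
```

The tuned bulk energy is always negative, and any mismatch between the two vacuum energies reappears as an effective 4-dimensional cosmological constant.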

A striking property of this type of configuration is that it allows gravity
to be localized on the brane. This is the so-called Randall-Sundrum scenario [58]
(see also [59] for earlier works). The 5-dimensional Einstein equations are
found to allow for a 4-dimensionally flat solution with a warp factor (i.e.
an overall fifth dimension-dependent factor in front of the four-dimensional
metric):

$$ds^2 = e^{-\lambda_b |y|/(3M^3)}\,\eta_{\mu\nu}\,dx^\mu dx^\nu + dy^2 \tag{61}$$

if the condition (60) is satisfied. Let us note that this condition ensures that
the bulk is anti-de Sitter since $\Lambda_B < 0$. If $\lambda_b > 0$, one finds a single
normalisable massless mode of the metric which is interpreted as the massless
4-dimensional graviton. The wave function of this mode turns out to be
localized close to the brane, which gives an explicit realisation of 4-dimensional
gravity trapping. There is also a continuum of non-normalisable massive
modes (starting from zero mass) which are interpreted as the Kaluza-Klein
graviton modes.

Of course, the Randall-Sundrum condition (60) is another version of the 
standard fine tuning associated with the cosmological constant. One would 
like to find a dynamical justification to it. 

Some progress has recently been made in this direction [60,61]. The presence
of a scalar field in the bulk, conformally coupled to the matter on the
brane, allows for some relaxation mechanism that screens the 4-dimensional
cosmological constant from corrections to the brane vacuum energy. Let
us indeed consider such a scalar field, of the type discussed above in the
context of scalar-tensor theories. The action is of the following form:

$$S = \int d^5x \sqrt{g^{(5)}}\,\bigl[\,\cdots\,\bigr] + S_m\bigl(\psi_m,\, g_{\mu\nu} f(\phi)\bigr), \tag{62}$$

where the fields $\psi_m$ are matter fields localized on the brane, located at
$y = 0$, and $g_{\mu\nu}$ is the 4-dimensional metric ($N$ are 5-dimensional indices
whereas $\mu$, $\nu$ are 4-dimensional indices). We will be mostly interested in
the 4-dimensional vacuum energy so that we can write the 4-dimensional
matter action as:

$$S_m = -\int d^4x \sqrt{g^{(4)}}\;\lambda_b\, f^2(\phi). \tag{63}$$

Five-dimensional Einstein equations, projected on the brane, provide the
following Friedmann equation:

$$H^2 = \frac{\cdots}{18 M^6}\;\cdots \tag{64}$$


The other equations, including the $\phi$ equation of motion, ensure that this
vanishes, irrespective of the precise value of $\lambda_b$, for the following metric:

$$ds^2 = e^{2a(y)}\,\eta_{\mu\nu}\,dx^\mu dx^\nu + dy^2, \tag{65}$$

where the derivative of the function $a(y)$ with respect to $y$ is fixed on the
brane by junction conditions (assuming a symmetry $y \to -y$)

$$a'(0) = \cdots \tag{66}$$

In other words, the cosmological constant is, to first order, not sensitive
to the corrections to the vacuum energy coming from the Standard Model
interactions.

For specific values of the potential, such dynamics localize gravity
around the brane. For example, with vanishing potential, the solution of
the equations is obtained for

$$f(\phi)\;\cdots \tag{67}$$

One obtains a flat 4-dimensional spacetime (indeed, in this case, this is the
unique solution [60]) although the vacuum energy may receive non-vanishing
corrections.

The price to pay is the presence of a singularity close to the brane. It
remains to be seen what is the interpretation of this singularity, how it
should be treated and whether this reintroduces fine tuning [62-64]. Also a
full cosmological treatment, i.e. including time dependence, is needed.

Presumably, supersymmetry plays an important role in this game if one 
wants to deal with stable solutions. Supersymmetry indeed may prove to be 
in the end the rationale for the vanishing of the cosmological constant. The 
picture that would emerge would be one of a supersymmetric bulk with van- 
ishing cosmological constant and with supersymmetry broken on the brane 
(remember that supersymmetry is related to translational invariance) [65]. 
Models along these lines have been discussed recently by Gregory, Rubakov 
and Sibiryakov [66]: the four-dimensional gravity is localized on the brane 
due to the existence of an unstable graviton boundstate. Presumably in 
such models one does not recover the standard theory of gravity. 

7 Conclusion 

The models discussed above are many. This is not a surprise since the
cosmological constant problem, although it has attracted theorists for decades,
has not yet received a convincing treatment. What is new is that one expects
in a not too distant future a large and diversified amount of observational
data that should make it possible to discriminate among these models. One may
mention the MAP and PLANCK satellites on the side of CMB measurements.
The SNAP mission should provide, on the other hand, large numbers of type
Ia supernovae which should allow a better handle on this type of measurement
and a significant increase in precision. But other methods will also
give complementary information: lensing, galaxy counts [67], gravitational
wave detection [39,68], ...
GRAVITINO PRODUCTION
AND SUPER-HIGGS EFFECT IN COSMOLOGY



R. Kallosh 



Abstract 

In this lecture we discuss some recent work on the role which
gravitinos may play in cosmology. Of particular interest is the
non-thermal generation of gravitinos after inflation. We explain the
super-Higgs effect in cosmology which gives a conceptual basis for the
understanding of the production of helicity ±1/2 gravitino.

The purpose of this lecture is to provide the basic concepts which allow
one to investigate the production of gravitinos in a cosmological background. If
supersymmetric particles exist in nature, one expects that the gravitino (a
fermionic partner of the graviton) also exists.

During the last year there have been a number of investigations of various
aspects of this problem [1-6]. The basic results of these studies will be
discussed here.

1 Introduction 

Useful reviews of general supergravity theory [7], which describes the gravitino,
can be found in [8-10]. Gravitinos can be produced during preheating after
inflation due to a combined effect of interactions with an oscillating inflaton
field and absence of conformal invariance. It was found that in general the
probability of the helicity ±1/2 gravitino production is not suppressed by
the small gravitational coupling [1]. In theories with spontaneously broken
supersymmetry this may lead to a copious production of gravitinos after
inflation. The efficiency of the new non-thermal mechanism of gravitino
production is very sensitive to the choice of the underlying theory. This may
put strong constraints on certain classes of inflationary models.

The theory of cosmological gravitino production is very complicated. The
massless gravitino has only helicity ±3/2 states. However, if the super-Higgs
mechanism with spontaneously broken supersymmetry takes place and the
gravitino acquires a mass, it acquires in addition helicity ±1/2 states: the
goldstino is eaten by the gravitino and becomes its helicity ±1/2 state. This
is analogous to the well known Higgs mechanism in the standard model: the
vector meson is massless when gauge symmetry is unbroken and has only 2
transverse physical states. When the gauge symmetry is spontaneously broken,
the vector meson acquires a mass. In addition to the 2 states of a massless
vector there is one more longitudinal state: it is a Goldstone boson eaten by
the vector field.

In this lecture we will discuss the production of all gravitino components,
transversal (helicity 3/2) and longitudinal (helicity 1/2). As we will show,
the production rate of the longitudinal gravitino component can be much
greater than that of the transversal gravitino components^1.

Since the gravitino is a part of the gravitational multiplet, one could
expect that its production must be strongly suppressed by the small
gravitational coupling. Indeed, usually the production of particles occurs
because their effective mass changes non-adiabatically during the
oscillations of the inflaton field [11]. This is the main effect responsible
for the production of the gravitinos with helicity 3/2. The gravitino mass
$m$ at small values of the inflaton field $\phi$ is proportional to
$M_P^{-2}W$, where $W$ is the superpotential. Thus the amplitude of the
oscillations of the gravitino mass is suppressed by $M_P^{-2}$. That is why
production of the gravitino components with helicity 3/2 is relatively
inefficient [12]. It was found there that, as in the case of Dirac particles,
the scale of the production of helicity 3/2 gravitinos is set by the
gravitino mass. This feature can be explained by the fact, discovered in [1],
that in general the massless helicity 3/2 gravitino is conformally coupled to
the metric. The violation of conformal invariance of these modes is due only
to the mass term, and therefore for a small gravitino mass there is no
production of the helicity 3/2 gravitino in the conformally flat FRW
background of the early universe.

In general, the production of particles in an FRW background is related to
the breaking of conformal invariance [14]. It is well known, for example,
that expansion of the universe does not lead to production of massless vector
particles and massless fermions of spin 1/2, because the theory of such
particles is conformally invariant and the Friedmann universe is conformally
flat. Meanwhile, massless scalar particles minimally coupled to gravity (as
well as gravitons) are created in an expanding universe because the theory
describing these particles is not conformally invariant. The rate of scalar
particle production is determined by the Hubble constant
$H = \dot a/a = \sqrt{\rho/3}\,M_P^{-1}$, where $\rho$ is the energy density.
If similar effects are possible for gravitinos,



^1 In flat spacetime, transversal components, which exist even if
supersymmetry is unbroken, correspond to helicity 3/2, while longitudinal
components with non-vanishing $\psi_0$ and $\gamma^i\psi_i$ correspond to
helicity 1/2. Although the helicity concept has a less precise meaning in FRW
metrics, we will still use this loose definition as a shortcut for the
transversal and longitudinal components.
they could be much stronger than the effects of thermal production of
gravitinos, studied in the 80's. Indeed, $H$ is suppressed only by the first
power of $M_P^{-1}$. As a result, $H$ typically is much greater than $m$
after inflation, so its time-dependence may lead to a more efficient particle
production.

The issue of conformal invariance of gravitinos is rather nontrivial, and 
until recently it has not been thoroughly examined. For gravitinos with 
helicity 3/2 the effects proportional to H do not appear, and therefore the 
violation of conformal invariance for such particles is very small, being pro- 
portional to their mass [12]. However, it was established in [1,6] that the 
theory of gravitinos with helicity 1/2 is not conformally invariant, and there- 
fore such particles will be produced during the expansion of the universe. 

But the most surprising effect which was found is that the production of
gravitinos of helicity 1/2 by the oscillating scalar field $\phi$ in general
is not suppressed by any powers of $M_P^{-1}$, and therefore their production
can be very efficient. The magnitude of the effect is model-dependent. This
result may have important cosmological implications, since it may allow one
to rule out certain classes of cosmological theories.

2 Super-Higgs effect in cosmology 

The concept of “goldstino” was introduced and studied mostly in the context
of static backgrounds where scalar fields are constant. For cosmology, where
the scalar fields and the metric depend strongly on time, the goldstino field
was not defined until now. In this part of the lecture we introduce the
cosmological goldstino, following [6].

We consider here the action of $d = 4$, $N = 1$ supergravity coupled to some
number of chiral multiplets. The theory is completely defined by the choice
of the holomorphic superpotential $W(\phi^i)$ and of the Kähler potential
$K(\phi^i, \phi_i)$. The fields in the supergravity multiplet include the
metric $g_{\mu\nu}$ and the spin-3/2 vector gravitino $\psi_\mu$. The fields
in the chiral multiplet include the complex scalars $\phi^i$ and their
complex conjugates $\phi_i$, as well as spin one half fields $\chi_i$ which
are left-handed and $\chi^i$ which are right-handed. Ignoring terms which are
quartic in fermions, the supergravity action is (the details can be found in
[6])

-9t^ [Xj Vx" + X" VXj] - m^^X^Xj - (2-1) 

Here 

, RltJ.i’u] = -k iw[^“'’(e)7a6^ (2.2) 
and $\omega_\mu{}^{ab}(e)$ is a metric compatible spin connection. The
potential $V$ depending on scalar fields is given by the well known formula:

$$V = e^K\left[-3M_P^{-2}WW^* + (\mathcal{D}^iW)\,g^{-1}{}_i{}^j\,(\mathcal{D}_jW^*)\right], \tag{2.3}$$

where $\mathcal{D}^iW$ is the Kähler covariant derivative of the
superpotential and $g_i{}^j$ is the Kähler metric of the moduli space. The
gravitino mass and the mass matrix for the chiral fermions are functions of
the superpotential and its covariant derivatives:

$m = M_P^{-2}\,e^{K/2}\,W,$

mV = (2.4) 

The terms in the action $\mathcal{L}_{\rm mix}$, which show that the
gravitino and chiral fermions are mixed, can be written in two different ways:

= Mp + ^-c) 

= + + (2.5) 

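Here we have defined the left-handed part of the goldstino field
$\upsilon_L$ as consisting of two parts, $\upsilon_L^1$ and $\upsilon_L^2$:

$$\upsilon_L = \upsilon_L^1 + \upsilon_L^2, \qquad \upsilon_L^1 = m^i\chi_i, \qquad \upsilon_L^2 = \not{\!\partial}\phi_i\,\chi^j g_j{}^i, \tag{2.6}$$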

and where we have introduced notations similar to (2.4): 

TO* = (2.7) 

The action has local supersymmetry, so one should fix the gauge. One
possibility is to use the gauge where the “goldstino” vanishes:

$$\upsilon = 0. \tag{2.8}$$

The eliminated fermion is the mode which is eaten by the gravitino to make it
massive. Notice that $\upsilon^1$ is the goldstino which was already
considered in [7], where the super-Higgs mechanism was studied in an
everywhere constant background. The backgrounds in cosmology may have some
non-vanishing time derivatives of the scalar fields. Therefore in the context
of cosmology it is natural to change the choice of the goldstino such that
the derivative mixing^2 terms between the gravitino and the fermions are also
taken into account. However, in (2.5) we can never (in case of more



^2 We will assume no vector fields in the background, therefore gaugino and
gravitino are mixed only in the term (2.5).
than one chiral multiplet) eliminate all the derivative terms by a choice of
the goldstino. The sum which we consider here is preferred because of the
supersymmetry transformation property

-\{y + ie^Mp^WW*) CL. (2.9) 

The quantity in brackets is the positive definite part of the potential. When
the scalars depend only on time, as we will assume in the cosmological
models, the first term is the kinetic energy $g_i{}^j\dot\phi^i\dot\phi_j$
and has the same sign as the second term. Thus the variation is non-zero, and
therefore $\upsilon$ is a good choice for the goldstino.

In what follows we concentrate on the simplest models with one chiral
multiplet $\Phi$ and arbitrary superpotential. The expression for the
goldstino simplifies since we have only one type of chiral fermion, so that
the condition

$$\upsilon = m^1\chi_1 + m_1\chi^1 + \not{\!\partial}\phi_1\,\chi^1 g_1{}^1 + \not{\!\partial}\phi^1\,\chi_1 g^1{}_1 = 0 \tag{2.10}$$

can be satisfied by totally removing the chiral fermion from the theory:

$$\upsilon = 0 \;\Rightarrow\; \chi = \chi_1 + \chi^1 = 0. \tag{2.11}$$

In the general case with many chiral multiplets only one combination of them
can be gauged away; this will be considered in [6].

We will present classical equations of motion and constraints for the
transverse and longitudinal gravitino in the expanding Friedmann universe
interacting with the moving inflaton field. We will use the gauge where the
goldstino is absent. Then we will solve the classical equations for the
gravitino.

We represent the equations describing gravitino components with helicities
3/2 and 1/2 in a form analogous to the equations for the usual spin 1/2
fermions with time-dependent mass. This allows us to reduce, to a certain
extent, the problem of gravitino production to the problem of production of
particles with spin 1/2 after preheating [15].

3 Gravitino equations in one chiral multiplet case 

In general background metrics in the presence of complex scalar fields with
non-vanishing VEVs, the starting equation for the gravitino has on the left
hand side the kinetic part and a rather lengthy right hand side, which will
be given in [6]. Apart from the varying gravitino mass
$m = M_P^{-2}e^{K/2}W$, the right hand side contains various mixing terms.
For a self-consistent setting of the problem, the gravitino equation should
be supplemented by the equations for the fields mixing with the gravitino, as
well as by the equations determining the gravitational background and the
evolution of the scalar fields.

Let us make some simplifications. We consider the supergravity multiplet and
a single chiral multiplet containing a complex scalar field $\Phi$ with a
superpotential $W$ and a single chiral fermion $\chi$. This is a simple
non-trivial extension which allows us to study the gravitino in the
non-trivial FRW cosmological metric supported by the scalar field. A nice
feature of this model is that the chiral fermion $\chi$ can be gauged to
zero, as explained above, so that the mixing between $\psi_\mu$ and $\chi$ is
absent. We can also choose the non-vanishing VEV of the scalar field in the
real direction, $\mathrm{Re}\,\Phi = \phi/\sqrt{2}$, $\mathrm{Im}\,\Phi = 0$.
The field $\phi$ plays the role of the inflaton field^3. Then from (2.1) we
can obtain the master equation for the gravitino field

$$\not{\!\!D}\psi_\mu + m\psi_\mu = \left(D_\mu - \frac{m}{2}\gamma_\mu\right)\gamma^\nu\psi_\nu. \tag{3.1}$$

Gravitino equation (3.1) is a curved spacetime generalization of the familiar
gravitino equation $(\not{\!\partial} + m_0)\psi_\mu = 0$ in a flat metric,
where $m_0$ is a constant gravitino mass.

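The generalization of the constraint equations $\partial^\mu\psi_\mu = 0$,
$\gamma^\mu\psi_\mu = 0$ reads as

$$D^\mu\psi_\mu - \not{\!\!D}\gamma^\mu\psi_\mu + \frac{3}{2}m\gamma^\mu\psi_\mu = 0, \tag{3.2}$$

$$\frac{3}{2}m^2\gamma^\mu\psi_\mu + m'a^{-1}\gamma^0\gamma^i\psi_i = -\frac{1}{2}G_{\mu\nu}\gamma^\mu\psi^\nu, \tag{3.3}$$

where $G_{\mu\nu}$ is the Einstein tensor and where $'$ stands for a
conformal time derivative $\partial_\eta$. It is important that the covariant
derivative in these equations must include both the spin connection and the
Christoffel symbols, otherwise the equation $D_\mu\gamma_\nu = 0$ used for the
derivation of these equations is not valid.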

The last equation will be especially important for us. Naively, one could
expect that in the limit $M_P \to \infty$, gravitinos should completely
decouple from the background. However, this equation implies that this is not
the case for the gravitinos with helicity 1/2. Indeed, from (3.3) one can
find an algebraic relation between $\gamma^0\psi_0$ and $\gamma^i\psi_i$:

$$\gamma^0\psi_0 = \hat A\,\gamma^i\psi_i. \tag{3.4}$$

Here $\hat A$ is a matrix which will play a crucial role in our description
of the interaction of the gravitino with the varying background fields. If
$\rho$ and $p$ are the background energy density and pressure, we have
$G^0_0 = M_P^{-2}\rho$, $G^i_k = -M_P^{-2}p\,\delta^i_k$, and one can
represent the matrix $\hat A$ as follows:

$$\hat A = \frac{p - 3m^2M_P^2}{\rho + 3m^2M_P^2} + \gamma_0\,\frac{2m'a^{-1}M_P^2}{\rho + 3m^2M_P^2} = A_1 + \gamma_0 A_2. \tag{3.5}$$



^3 Typical time evolution of the homogeneous inflaton field starts with the
regime of inflation when $\phi$ slowly rolls down. One can construct a
superpotential $W$ which provides chaotic inflation for $\phi > M_P$. When
$\phi(t)$ drops below $\sim M_P$, it begins to oscillate coherently around
the minimum of its effective potential $V(\phi)$.
Note that in the limit $M_P \to \infty$ with fixed $\Phi$ in
$z = \Phi M_P^{-1}$, one has $m = M_P^{-2}W$. If $W$ does not blow up in this
limit, the matrix $\hat A$ is given by

$$\hat A = \frac{p}{\rho} + \gamma_0\,\frac{2\dot W}{\rho}, \tag{3.6}$$

where the dot stands for the derivative $\partial_t$, and the relation
between physical and conformal times is given by $dt = a(\eta)\,d\eta$. In
the flat case without moving scalars, $\hat A = -1$.

For definiteness, we will consider the minimal Kähler potential
$K = zz^* = \Phi\Phi^*/M_P^2$. In the models where the energy-momentum tensor
is determined by the energy of a classical scalar field and $\Phi$ depends
only on time we have

$$\rho = |\dot\Phi|^2 + V, \qquad p = |\dot\Phi|^2 - V. \tag{3.7}$$

Then in (3.5) the combination $A = A_1 + iA_2$ emerges. For a single chiral
multiplet we obtain $|A| = 1$. (One can show that $|A| = 1$ for the theories
with one chiral multiplet even if the Kähler potential is not minimal.)
Therefore $A$ can be represented as^4

$$A = -\exp\left(2i\int dt\,\mu(\eta)\right). \tag{3.8}$$

Using the Einstein equations, one obtains for $\mu$ (for minimal Kähler
potential and real scalar field):









mi 



mi=m’ = , 



H = aa~^ = a'a~^. (3.9) 



The expression for $\mu$ becomes much simpler and its interpretation is more
transparent if the amplitude of oscillations of the field $\phi$ is much
smaller than $M_P$. In the limit $\phi/M_P \to 0$ one has

$$\mu = \partial_\Phi\partial_\Phi W. \tag{3.10}$$



This coincides with the mass of both fields of the chiral multiplet (the scalar 
field and spin 1/2 fermion) in rigid supersymmetry. When supersymmetry is 
spontaneously broken, the chiral fermion, goldstino, is “eaten” by gravitino 
which becomes massive and acquires helicity ±1/2 states in addition to 
helicity ±3/2 states of the massless gravitino. 
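
The two statements $|A| = 1$ and $\mu \to \partial_\Phi\partial_\Phi W$ can be checked directly in the rigid limit. The following is only a short sketch under stated assumptions: a real background field $\Phi(t)$, the $M_P \to \infty$ form (3.6) of the matrix, $\rho$ and $p$ as in (3.7) with $V = W_\Phi^2$ (where $W_\Phi \equiv \partial_\Phi W$), and the field equation $\ddot\Phi = -W_\Phi W_{\Phi\Phi}$ with the Hubble friction term neglected.

```latex
\begin{align*}
A &= \frac{p}{\rho} + i\,\frac{2\dot W}{\rho}
   = \frac{\dot\Phi^2 - W_\Phi^2 + 2i\,W_\Phi\dot\Phi}{\dot\Phi^2 + W_\Phi^2}
   = \frac{(\dot\Phi + iW_\Phi)^2}{\dot\Phi^2 + W_\Phi^2}
   \quad\Longrightarrow\quad |A| = 1, \\[4pt]
A &= e^{2i\theta}, \qquad \theta = \arg\bigl(\dot\Phi + iW_\Phi\bigr), \\[4pt]
\dot\theta &= \frac{\dot\Phi\,\dfrac{d}{dt}W_\Phi - W_\Phi\,\ddot\Phi}{\dot\Phi^2 + W_\Phi^2}
   = \frac{W_{\Phi\Phi}\bigl(\dot\Phi^2 + W_\Phi^2\bigr)}{\dot\Phi^2 + W_\Phi^2}
   = W_{\Phi\Phi}.
\end{align*}
```

Comparing with the representation (3.8) then gives $\mu = \dot\theta = W_{\Phi\Phi}$ in this limit. The overall minus sign in (3.8) corresponds to the initial phase $\theta = \pi/2$ at a moment when $\dot\Phi = 0$, where $A = -1$.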



^4 Initial conditions at inflation at $\eta \to -\infty$ correspond to
$p = -\rho$, $m' = 0$ and $A = -1$, which gives $\mu(-\infty) = 0$.
Alternatively, we can start with inflaton oscillations at $\eta = 0$, which
defines the phase up to some constant. The final results depend only on
$\mu$.
It is most important to understand that the matrix $\hat A$ does not become
constant in the limit $M_P \to \infty$. The phase (3.8) rotates when the
background scalar field oscillates. The amplitude and sign of $\hat A$ change
two times within each oscillation. Consequently, the relation between
$\gamma^0\psi_0$ and $\gamma^i\psi_i$ also oscillates during the field
oscillations. This means that the gravitino with helicity 1/2 (which is
related to $\psi_0$) remains coupled to the changing background even in the
limit $M_P \to \infty$. In a sense, the gravitino with helicity 1/2 remembers
its goldstino nature. This is the main reason why the gravitino production in
this background in general is not suppressed by the gravitational coupling.
The main dynamical quantity which is responsible for the gravitino production
in this scenario will not be the small changing gravitino mass $m(t)$, but
the mass of the chiral multiplet $\mu$, which is much larger than $m$. As we
will see, this leads to efficient production of gravitinos in the models
where the mass of the “goldstino” changes non-adiabatically with time.

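We shall solve the master equation (3.1) using the constraint equations in
the form (3.3) and (3.2). We use the plane-wave ansatz
$\psi_\mu \sim e^{i\mathbf{k}\cdot\mathbf{x}}$ for the space-dependent part.
Then $\psi_i$ can be decomposed^5 [13] into its transverse part $\psi_i^T$,
the trace $\gamma^i\psi_i$ and the trace $\mathbf{k}\cdot\psi$:

$$\psi_i = \psi_i^T + \left(\frac{1}{2}\gamma_i - \frac{1}{2}\hat k_i(\hat{\mathbf{k}}\cdot\boldsymbol{\gamma})\right)\gamma^j\psi_j + \left(\frac{3}{2}\hat k_i - \frac{1}{2}\gamma_i(\hat{\mathbf{k}}\cdot\boldsymbol{\gamma})\right)\hat{\mathbf{k}}\cdot\psi, \tag{3.11}$$

where $\hat k_i = k_i/|\mathbf{k}|$, so that
$\gamma^i\psi_i^T = \hat k^i\psi_i^T = 0$. We will relate $\gamma^i\psi_i$
with $\psi_0$ and with $\mathbf{k}\cdot\psi$, so that, after use of the field
equations, there are two degrees of freedom associated with the transverse
part $\psi_i^T$, which correspond to helicity ±3/2, and two degrees of
freedom associated with $\gamma^i\psi_i$ (or $\psi_0$), which correspond to
helicity ±1/2.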

For the helicity ±3/2 states we have to derive the equation for $\psi_i^T$.
We apply the decomposition (3.11) to the master equation (3.1) for $\mu = i$
and obtain

$$\left(\gamma^0\partial_0 + i\boldsymbol{\gamma}\cdot\mathbf{k} + \frac{a'}{2a}\gamma^0 + ma\right)\psi_i^T = 0. \tag{3.12}$$

In the limit of vanishing gravitino mass, the transverse part $\psi_i^T$ is
conformal with a weight +1/2. The transformation
$\psi_i^T = a^{-1/2}\Psi_i^T$ reduces the equation for the transverse part to
the free Dirac equation with a time-varying mass term $ma$. It is well known
how to treat this type of equation (see, e.g., [15]). The essential part of
$\Psi_i^T$ is given by the time-dependent part of the



^5 We now use $\psi_i$ with $i = 1,2,3$ for the space components of
$\psi_\mu$, while for gamma matrices $\gamma^i$ are space components of flat
$\gamma^a$, and similarly for the 0 index.
eigenmode of the transversal component $y_T(\eta)$, which obeys the
second-order equation

$$y_T'' + \left(k^2 + \Omega_T^2 - i\Omega_T'\right)y_T = 0, \tag{3.13}$$

where the effective mass is $\Omega_T = m(\eta)a(\eta)$.

The corresponding equation for the gravitino with helicity 1/2 is more
complicated. We have to find $\mathbf{k}\cdot\psi$ and $\gamma^i\psi_i$. The
equation for the components $\hat k^i\psi_i$ can be obtained from the
constraint equation (3.2):



zk • Ip 



7o -I- zy • k — ma yVz- 

a 



(3.14) 



Combining all terms together, we obtain the on-shell decomposition for the 
longitudinal part 



V'i = V'?’ + ( fc* 7 • k -f - 7*k • 7) 



-70 ■ 



(3.15) 



Now we can derive an equation for $\gamma^i\psi_i$. From the zero component
of (3.1) we have



3a' Q / 3 , 

^7 V'o+I + 



ma 



tpo = (yV*)' - — yoyVi- 



(3.16) 



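This equation does not contain the time derivative of $\psi_0$. Substituting
$\psi_0$ from (3.4) into (3.16), we get an equation for $\gamma^i\psi_i$:

$$\left(\partial_\eta + \hat B - i\mathbf{k}\cdot\boldsymbol{\gamma}\,\gamma_0\hat A\right)\gamma^i\psi_i = 0, \tag{3.17}$$

where

$$\hat A = -\exp\left(2\gamma_0\int\mu\,dt\right), \tag{3.18}$$

and

$$\hat B = -\frac{3a'}{2a}\hat A - \frac{ma}{2}\gamma_0\left(1 + 3\hat A\right). \tag{3.19}$$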



We can split the spinors $\gamma^i\psi_i$ in eigenvectors of $\gamma_0$,
$\gamma^i\psi_i = \theta_+ + \theta_-$, with
$\theta_\pm = \frac{1}{2}(1 \mp i\gamma_0)\gamma^i\psi_i$. From the Majorana
condition it follows that $\theta_\pm(k)^* = \mp C\theta_\mp(-k)$, where $C$
is the charge conjugation matrix. In a representation with diagonal
$\gamma_0$ the components $\theta_\pm$ correspond to the
$\gamma_0$-eigenvalues $\pm i$. Acting on (3.17) with the hermitian conjugate
operation gives us a second-order differential equation for $\theta_+$. We
choose for each $k$ a spinor basis $u_{1,2}(k)$ for the two components of
$\theta_+$, and two independent solutions of the second-order differential
equations, $f_{1,2}(k,\eta)$. The general solution is given by

$$\theta_+ = \sum_{\alpha,\beta=1}^{2} a^{\alpha\beta}(k)\,f_\alpha(k,\eta)\,u_\beta(k), \qquad \theta_- = -C^{-1}\sum_{\alpha,\beta=1}^{2} a^{*}_{\alpha\beta}(-k)\,f^*_\alpha(-k,\eta)\,u^*_\beta(-k). \tag{3.20}$$



The last equality determines reality properties of the coefficients. Here we
represented $\hat B$ as $B_1 + \gamma_0B_2$ and defined $B = B_1 + iB_2$, by
analogy with the definitions for the matrix $\hat A$. By the substitution
$f_\alpha(k,\eta) = E(\eta)\,y_L(\eta)$, with
$E = (-A^*)^{1/2}\exp\left(-\int d\eta\,B_1\right)$, the equation for the
functions $f_\alpha(k,\eta)$ is reduced to the final oscillator-like equation
for the time-dependent mode function $y_L(\eta)$:

$$y_L'' + \left(k^2 + \Omega_L^2 - i\Omega_L'\right)y_L = 0. \tag{3.21}$$

Here

$$a^{-1}\Omega_L = \mu - \frac{3}{2}H\sin\left(2\int\mu\,dt\right) + \frac{1}{2}m\left(1 - 3\cos\left(2\int\mu\,dt\right)\right). \tag{3.22}$$

In the derivation of (3.21) it was essential that $A$ has the form (3.8).
Finally, one can estimate the number density of gravitinos produced by the
oscillating scalar field in several inflationary models, and show that in
some models the ratio $n_{3/2}/s$ may substantially exceed the bound
$n_{3/2}/s < O(10^{-14})$.



4 Gravitino production 



Gravitino production occurs at the end of inflation, when the scalar field
$\phi$ rapidly rolls down toward the minimum of its effective potential
$V(\phi)$ and oscillates there. During this stage the vacuum fluctuations of
the gravitino field are amplified, which corresponds to the gravitino
production.

Production of gravitinos with helicity 3/2 is described in terms of the mode
function $y_T(\eta)$. This function obeys the equation (3.13) with
$\Omega_T = ma$, which is suppressed by $M_P^{-2}$. Non-adiabaticity of the
effective mass $\Omega_T(\eta)$ results in the departure of $y_T(\eta)$ from
its positive frequency initial condition, which can be interpreted as
particle production. The theory of this effect is completely analogous to the
theory of production of usual fermions of spin 1/2 and mass $m$ [15]. Indeed,
equation (3.13) coincides with the basic equation which was used in [15] for
the investigation of production of Dirac fermions during preheating.
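
The departure from a positive-frequency solution can be illustrated numerically. The following is a minimal sketch, not the analysis of [15]: it integrates an equation of the form $y'' + (k^2 + \Omega^2 - i\Omega')y = 0$ with a fourth-order Runge-Kutta step, for a hypothetical smooth step in the effective mass (the `tanh` profile, the function names and all parameter values are assumptions chosen for illustration). It returns the amplitude of the negative-frequency component at late times, which is a measure of non-adiabaticity rather than a properly normalized fermion occupation number.

```python
import math

def mass_profile(eta, m_i, m_f, width):
    """Hypothetical smooth step from m_i to m_f; returns (Omega, dOmega/deta)."""
    s = math.tanh(eta / width)
    omega = m_i + 0.5 * (m_f - m_i) * (1.0 + s)
    d_omega = 0.5 * (m_f - m_i) * (1.0 - s * s) / width
    return omega, d_omega

def negative_frequency_admixture(k, m_i, m_f, width=0.1, eta_max=10.0, steps=20000):
    """Integrate y'' + (k^2 + Omega^2 - i Omega') y = 0 from -eta_max to +eta_max.

    The mode starts as a pure positive-frequency wave of unit amplitude.
    At late times y = alpha*exp(-i w_f eta) + beta*exp(+i w_f eta), and
    |beta| = |y' + i w_f y| / (2 w_f) is returned.
    """
    def rhs(eta, y, v):
        om, dom = mass_profile(eta, m_i, m_f, width)
        return v, -(k * k + om * om - 1j * dom) * y

    h = 2.0 * eta_max / steps
    eta = -eta_max
    w_i = math.sqrt(k * k + m_i * m_i)
    y, v = 1.0 + 0.0j, -1j * w_i              # positive-frequency initial data
    for _ in range(steps):                     # classical RK4 step
        k1y, k1v = rhs(eta, y, v)
        k2y, k2v = rhs(eta + 0.5 * h, y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = rhs(eta + 0.5 * h, y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = rhs(eta + h, y + h * k3y, v + h * k3v)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        eta += h
    w_f = math.sqrt(k * k + m_f * m_f)
    return abs(v + 1j * w_f * y) / (2.0 * w_f)

# Constant mass: the mode stays positive-frequency (admixture close to zero).
print(negative_frequency_admixture(k=1.0, m_i=1.0, m_f=1.0))
# Rapid change of the mass: a sizeable negative-frequency component appears.
print(negative_frequency_admixture(k=1.0, m_i=0.0, m_f=2.0))
```

Making `width` much larger than the inverse mode frequency drives the second result toward zero, which is the adiabatic limit in which no particles are produced.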
The description of production of gravitinos with helicity 1/2 is similar but
somewhat more involved. The wave function of the helicity 1/2 gravitino is a
product of the factor $E(\eta)$ and the function $y_L(\eta)$. The factor
$E(\eta)$ does not depend on momenta and controls only the overall scaling of
the solution. It is the function $y_L(\eta)$ that controls particle
production, which occurs because of the non-adiabatic variations of the
effective mass parameter $\Omega_L$. The function $y_L(\eta)$ obeys the
equation (3.21) with the effective mass $\Omega_L$, which is given by the
superposition (3.22) of all three mass scales in the problem: $\mu$, $H$ and
$m$.

In different models of inflation, different terms of $\Omega_L$ will have
different impact on the helicity 1/2 gravitino production. The strongest
effect usually comes from the largest mass scale $\mu$, if it is varying with
time. This makes the production of gravitinos of helicity 1/2 especially
important.

To fully appreciate this fact, one should note that if instead of considering
supergravity one would consider SUSY with the same superpotential, then the
goldstino $\chi$ (which is eaten by the gravitino in supergravity) would have
a mass which coincides with $\mu$ in the limit of large $M_P$, see equation
(3.10). As a result, equation (3.21) describing creation of gravitinos with
helicity 1/2 at $\phi \ll M_P$ looks exactly like the equation describing
creation of goldstinos in SUSY. That is why production of gravitinos with
helicity 1/2 may be very efficient: in a certain sense it is not a
gravitational effect. (On the other hand, the decay rate of gravitinos
$\Gamma \sim m^3/M_P^2$ is very small because it is suppressed by the
gravitational coupling $M_P^{-2}$.)

In order to understand the general picture, we will consider here a toy model
with the superpotential $W = \sqrt{\lambda}\Phi^3/3$, which at
$\phi \ll M_P$ leads to the effective potential $\lambda\phi^4/4$. The
oscillations of the scalar field near the minimum of this potential are
described by the elliptic cosine,
$\phi(\eta) = \frac{\phi_0}{a}\,\mathrm{cn}\left(\sqrt{\lambda}\phi_0\eta, \frac{1}{\sqrt{2}}\right)$.
The frequency of oscillations is $0.8472\sqrt{\lambda}\phi_0$ and the initial
amplitude is $\phi_0 \simeq M_P$ [11].
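
The numerical factor 0.8472 can be reproduced from the period of the elliptic cosine. The sketch below assumes the standard facts that $\mathrm{cn}(x, k)$ has period $4K(k)$ in $x$ and that the complete elliptic integral satisfies $K(k) = \pi / (2\,\mathrm{AGM}(1, \sqrt{1-k^2}))$; the function names are illustrative only.

```python
import math

def agm(a, b, tol=1e-14):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def quartic_frequency_factor():
    """Frequency of cn(x, 1/sqrt(2)) in the variable x = sqrt(lambda)*phi_0*eta."""
    k = 1.0 / math.sqrt(2.0)                       # modulus of the elliptic cosine
    big_k = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))  # K(k)
    period = 4.0 * big_k                           # period of cn in x
    return 2.0 * math.pi / period

FACTOR = quartic_frequency_factor()
print(f"frequency = {FACTOR:.4f} * sqrt(lambda) * phi_0")
```

With these relations the factor is simply $\mathrm{AGM}(1, 1/\sqrt{2})$, which agrees with the value 0.8472 quoted in the text.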

The parameter $\mu$ for this model is given by $\mu = \sqrt{2\lambda}\phi$.
It rapidly changes in the interval between 0 and $\sqrt{2\lambda}\phi_0$.
Initially it is of the same order as $H$ and $m$, but then $H$ and $m$
rapidly decrease as compared to $\mu$, and therefore the oscillations of
$\mu$ remain the main source of the gravitino production. In this case
production of gravitinos with helicity 1/2 is much more efficient than that
of helicity 3/2.
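
The hierarchy between the three mass scales can be made concrete with a rough estimate. This sketch works in units $M_P = 1$ and makes several assumptions that are not spelled out in the text: the energy density is taken to be the potential $\lambda\phi^4/4$ at the turning point of an oscillation, $H$ follows from the Friedmann equation with the reduced Planck mass, the Kähler factor $e^{K/2}$ in $m$ is dropped, and $\lambda = 10^{-13}$ is used as an illustrative value.

```python
import math

def toy_model_scales(phi, lam=1e-13):
    """Mass scales for W = sqrt(lam)*Phi**3/3 with Phi = phi/sqrt(2), M_P = 1.

    phi is the current oscillation amplitude of the real inflaton field.
    Returns (mu, H, m).
    """
    big_phi = phi / math.sqrt(2.0)
    mu = 2.0 * math.sqrt(lam) * big_phi       # d^2 W / d Phi^2 = sqrt(2*lam)*phi
    potential = lam * phi**4 / 4.0            # energy density at the turning point
    hubble = math.sqrt(potential / 3.0)       # H^2 = rho/3 (assumption: reduced M_P)
    m = math.sqrt(lam) * big_phi**3 / 3.0     # m = W / M_P^2, Kahler factor dropped
    return mu, hubble, m

for phi in (1.0, 0.1, 0.01):
    mu, hubble, m = toy_model_scales(phi)
    print(f"phi = {phi:5.2f}   mu/H = {mu / hubble:9.1f}   mu/m = {mu / m:11.1f}")
```

Under these assumptions $\mu/H$ grows like $1/\phi$ and $\mu/m$ like $1/\phi^2$ as the amplitude decays, so the three scales are comparable only near $\phi \sim M_P$, and $\mu$ dominates afterwards.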

The theory of production of gravitinos with helicity 1/2 in this model is
similar to the theory of production of spin 1/2 fermions with mass
$\sqrt{2\lambda}\phi$ by the coherently oscillating scalar field in the
theory $\lambda\phi^4/4$. This theory has been investigated in [15]. The
result can be formulated as follows. Even though the expression for $\Omega$
contains a small factor $\sqrt{2\lambda}$, one cannot use the perturbation
expansion in $\lambda$. This is because the frequency of the background field
oscillations is also proportional to $\sqrt{\lambda}$. Growth of fermionic
modes (3.21) occurs in the non-perturbative regime of parametric excitation.
The modes get fully excited with occupation numbers $n_k \sim 1/2$ within
about ten oscillations of the field $\phi$, and the width of the parametric
excitation of fermions in momentum space is about $\sqrt{\lambda}\phi_0$.
This leads to the following estimate for the energy density of created
gravitinos,

$$\rho_{3/2} \sim (\sqrt{\lambda}\phi_0)^4 \sim \lambda V(\phi_0), \tag{4.1}$$

and the number density of gravitinos

$$n_{3/2} \sim \lambda^{3/4}V^{3/4}(\phi_0). \tag{4.2}$$

Now let us suppose that at some later moment reheating occurs and the energy
density $V(\phi_0)$ becomes transferred to the energy density of a hot gas of
relativistic particles with temperature $T \sim V^{1/4}$. Then the total
entropy of such particles will be $s \sim T^3 \sim V^{3/4}$, so that

$$\frac{n_{3/2}}{s} \sim \lambda^{3/4} \sim 10^{-10}. \tag{4.3}$$

This result violates the cosmological constraints on the abundance of
gravitinos with mass $\sim 10^2$ GeV by 4 orders of magnitude. In this model
the ratio $n_{3/2}/s$ does not depend on the time of thermalization, because
both $n_{3/2}$ and $V(\phi_0)^{3/4}$ decrease as $a^{-3}$. To avoid this
problem one may, for example, change the shape of $V(\phi)$ at small $\phi$,
making it quadratic.
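
The arithmetic behind this estimate is short enough to check directly. In the sketch below the coupling $\lambda = 10^{-13}$ and the bound $10^{-14}$ on $n_{3/2}/s$ are assumptions: the first is the value usually quoted for this model, and the second is inferred from the statement that the estimate exceeds the constraint by 4 orders of magnitude. All order-one prefactors are ignored, as in (4.1)-(4.3).

```python
import math

def gravitino_to_entropy(lam):
    """Order-of-magnitude estimate n_{3/2}/s ~ lam**(3/4), following (4.3)."""
    return lam ** 0.75

LAMBDA = 1e-13   # assumed quartic coupling of the lambda*phi^4/4 model
BOUND = 1e-14    # assumed upper bound on n_{3/2}/s (inferred, see lead-in)

ratio = gravitino_to_entropy(LAMBDA)
excess_orders = math.log10(ratio / BOUND)
print(f"n_3/2 / s ~ {ratio:.1e}")
print(f"exceeds the assumed bound by about {excess_orders:.1f} orders of magnitude")
```

Because the estimate depends only on $\lambda^{3/4}$, larger couplings would weaken the suppression considerably.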

In our investigation we studied only the models with one chiral multiplet. 
This is good enough to show that nonthermal gravitino production may 
indeed cause a serious problem, but much more work should be done in 
order to check whether the problem actually exists in realistic models with 
several different chiral and vector multiplets. 

First of all, one should write and solve a set of equations involving several
multiplets. Even in the case of one multiplet it is extremely difficult, and
the results which we obtained are very unexpected. The situation with many
multiplets is even more involved. One possibility is to consider the limit
$M_P \to \infty$, since the most interesting effects should still exist in
this limit.

But this is not the only problem to be considered. In the toy models studied
in this section with $W = m_\phi\Phi^2/2$ and $W = \sqrt{\lambda}\Phi^3/3$,
the superpotential $W$ and the gravitino mass vanish in the minimum of the
potential at $\phi = 0$. Then after the end of oscillations supersymmetry is
restored, the super-Higgs effect does not occur, and instead of massive
gravitinos we have ordinary chiral fermions. In order to study production of
gravitinos with helicity 1/2 with nonvanishing mass $m \sim 10^2$ GeV one
must introduce additional terms in the superpotential, and make sure that
these terms do not lead to a large vacuum energy density.

Models with one chiral superfield which satisfy all of these requirements do
exist. The simplest one is the Polonyi model with
$W = \alpha\left((2 - \sqrt{3})M_P + \Phi\right)$. One can introduce various
generalizations of this model. However, potentials in all models of this type
that we were able to construct are much more complicated than the potentials
of the toy models studied in this section. In particular, if one simply adds
small terms $\alpha + \beta\Phi + \ldots$ to the superpotentials
$\sim \Phi^2$ or $\Phi^3$, one typically finds that $V$ becomes negative in
the minimum of the potential, which sometimes becomes shifted to the
direction $\phi_2$, where $\Phi = (\phi_1 + i\phi_2)/\sqrt{2}$. This problem
can be easily cured in realistic theories with many multiplets, which is
another reason to study such models.

It would be most important to verify, in the context of these models, the
validity of our observation that the probability of production of gravitinos
of helicity 1/2 is not suppressed by the gravitational coupling. We have
found, for example, that the ratio $n_{3/2}/s$ for the gravitinos with
helicity 1/2 in the model $\lambda\phi^4/4$ is suppressed by
$\lambda^{3/4}$. This suppression is still rather strong because the coupling
constant $\lambda$ is extremely small in this model,
$\lambda \sim 10^{-13}$. However, in such models as the hybrid inflation
scenario all coupling constants typically are $O(10^{-1})$. If production of
gravitinos in such models is suppressed only by powers of the coupling
constants, one may need to take special precautions in order to avoid
producing an excessively large number of gravitinos during preheating.

A more detailed description of the effects discussed above will be con- 
tained in the forthcoming paper [6]. 
PHYSICS OF THE EARLY UNIVERSE:
BARYOGENESIS: DEFECTS: INITIAL CONDITIONS



N. Turok 



Abstract 

In these lectures I review some topics in early universe physics. First,
I discuss the problem of the origin of the observed matter-antimatter
asymmetry and the fascinating suggestion that it could have been
generated at the electroweak scale, via physics we shall probe at
accelerators in the near future. Next I discuss the formation of topological
defects at symmetry breaking phase transitions in the early universe.
This is an almost unavoidable phenomenon in unified theories, and
the idea that the defects later seeded structure in the universe was
very appealing. Unfortunately, as I review, recent accurate computations
of the clustering of mass and the microwave anisotropies in these
theories, combined with recent high precision data, have all but ruled
the theories out. Finally, I turn to some recent work on the foundations
of inflation, which seems at present our best bet for explaining
the origin of structure in the universe.

1 Introduction 

According to the hot Big Bang theory of the early universe we live inside 
the ultimate high energy accelerator. If we extrapolate back in time, the 
temperature of the universe grows without limit, so that ultra-high energy 
physics becomes relevant. Over the past quarter of a century many ideas 
have arisen as to how new physics at high energies could have impacted the 
Big Bang in its earliest moments. These ideas have been aimed at resolving
some of the most profound cosmological puzzles - the nature of the dark 
matter, the origin of structure and the beginning of the universe. 

What is new and exciting about the current era is that speculative ideas 
about the earliest moments of the universe are becoming experimentally 
testable. Indeed high energy physicists are increasingly looking towards 
cosmology as a testing, and even a proving ground of the most fundamental 
theoretical ideas. 
In these lectures the topics I cover follow a reverse historical order,
starting with the comparatively well tested physics of 100 GeV relevant at
$10^{-10}$ seconds after the hot Big Bang. I move on to much higher energies,
and the Grand unification scale of $10^{16}$ GeV relevant $10^{-38}$ seconds
after the hot Big Bang. These ideas are more speculative but are still
interesting because of distinct observational signatures. Finally I discuss
the topic of inflation and in particular of the state of the universe prior
to inflation. I am personally excited about the recent progress on this
topic, and hopeful it will lead us further in our search for a compelling
inflationary theory. Again, even though the ideas are highly speculative,
certain aspects of them turn out to be observationally testable.

The electroweak theory is a crowning glory of twentieth century physics.
It is a principled, unified account of a vast range of experimental
phenomena. The theory is all but experimentally verified (the missing link
being the Higgs boson), thanks to the heroic efforts of experimentalists at
CERN, Fermilab and elsewhere. The main importance of the electroweak theory
for cosmology is the possibility that the observed matter-antimatter
asymmetry in the universe was produced at an electroweak phase transition. A
truly beautiful complex of physics is involved, coupling chirality, topology
and anomalies with non-equilibrium phenomena. Unfortunately the minimal (one
Higgs) standard model seems to fail for several reasons, as I shall discuss.
But simple extensions such as the supersymmetric version may work, and will
be experimentally probed at the LHC at CERN and the Tevatron at Fermilab over
the next decade. There is an exciting possibility that the physics discovered
there, including Higgs bosons and CP violating couplings, will in due course
provide a compelling, quantitative account of the production of the
matter-antimatter asymmetry and thence the elemental abundances in the early
universe.

I move on to the cosmological impact of physics at much higher energies, 
at the scale of Unification of the strong and electroweak forces. Again there 
are fascinating phenomena related to phase transitions in which symmetries 
between fundamental forces and particles were broken. One of the most in- 
teresting possibilities is that cosmic defects were formed including strings or 
textures, which later survived as remnants of the Unification era. This idea 
has provided for some time an “honourable opposition” to the more popu- 
lar inflationary scenario for the origin of structure in the universe. Indeed 
the defect theories were more attractive to some of us than inflation be- 
cause they involved only one free parameter naturally related to the Grand 
Unification scale, and were more predictive than inflation. However recent 
high precision calculations and new observations seem to have definitively 
ruled out the simplest cosmic defects models. Whilst this is certainly disap- 
pointing for past advocates of the theory like myself, one must acknowledge 
that it is a giant step in particle cosmology. It demonstrates that our ideas
are testable, and the failure helps to point us in the right direction. 

The final topic of these lectures involves physics at the earliest mo- 
ments of the universe and how the universe itself might have begun. This 
is clearly a speculative endeavour. Inflation is the most successful scenario 
for the origin of structure in the universe that we have. But it has fun- 
damental shortcomings which need to be faced and resolved before it can 
really be called a “theory” in the traditional sense of theoretical physics. 
It is possible such a resolution awaits a full theory of quantum gravity, or 
that it is simply beyond us. But such doubts are no excuse for not thinking 
about the problem now. One way to make progress is to attempt to define 
inflation more precisely, in the hope that we find a compelling resolution of 
its fundamental problems. 

One of the key problems is the question of the initial conditions prior 
to inflation. All the predictions of inflation are to some extent contingent 
on one's assumptions about the initial conditions. There is an interesting
proposal for this which seems to be mathematically at least somewhat better 
defined than the alternatives. This is the no boundary proposal due to 
Hartle and Hawking, and as I shall explain I think it is worth exploring. To 
an extent which has perhaps not been sufficiently appreciated, the Hartle- 
Hawking proposal offers a surprisingly well defined predictive framework. I 
describe our recent work aimed at developing the observational predictions 
of the no boundary proposal, for generic inflationary models. 

2 Electroweak baryogenesis 

Electroweak baryon number violation and baryogenesis are some of the most 
interesting phenomena in standard model high energy physics [1-7]. The 
physics involves topology, anomalies and nonequilibrium phenomena at the 
limits of current field theory techniques. In cosmology, the scenario of elec- 
troweak baryogenesis offers the prospect of the next great leap back in time, 
to $10^{-10}$ seconds after the hot Big Bang (for reviews see [8,9]). It could
allow us to fix the one free parameter governing the abundances of the light 
elements in the universe, namely the baryon to photon ratio. 

There are many obstacles to making such a connection, not least because 
the only cosmological test is the single number we are trying to explain. Cur- 
rently, the only electroweak baryogenesis scenarios which work have several 
free parameters and the baryon asymmetry depends crucially on several of 
them. A compelling account will only be achieved if the relevant Higgs 
masses and couplings are experimentally measured. The most interesting 
candidate theory is the MSSM (minimal supersymmetric standard model). 
Here, the electroweak phase transition is only strong enough to allow baryo- 
genesis if some of the Higgs masses are not far above those currently being 
probed at the LHC. But excluding the MSSM as the origin of matter is
going to be much easier than confirming it. Confirmation will require the 
measurement of key CP violating phases which are needed to produce the 
asymmetry. These phases may be beyond the reach of currently planned 
particle physics experiments. Nevertheless, I believe we are justified in be- 
ing sanguine about the scenario, if only because the physics involved in this 
subject is quite compelling. For want of space I am not going to discuss 
the details of different models for electroweak baryogenesis. I shall attempt 
rather to give a glimpse of the underlying physics and a very partial account 
of how it might work. For a much more complete recent review see [10]. 

Hot Big Bang nucleosynthesis is one of the great successes of the standard
cosmology. The relative abundances of five stable light elements (H,
D, $^3$He, $^4$He and $^7$Li) are correctly predicted with a single adjustable parameter,
which can be translated into the baryon-to-photon ratio in today’s
universe. The observed light element abundances are consistent with the 
(conservative) range 



$$ 3 \times 10^{-10} < \eta \equiv \frac{n_B}{n_\gamma} < 10^{-9} \tag{2.1} $$

which translates into the bounds $0.011\,h^{-2} < \Omega_B < 0.026\,h^{-2}$ where $\Omega_B$ is
the density of baryons in units of the critical density and $0.5 < h < 0.8$ is
the Hubble constant in units of 100 km s$^{-1}$ Mpc$^{-1}$. The great observational
advance of recent times has been the discovery of deuterium in high redshift
hydrogen clouds. Some of the most impressive results are those of Burles and
Tytler who have measured the deuterium to hydrogen abundance ratio to
be $(3.3 \pm 0.3) \times 10^{-5}$ in a cloud at redshift 3.6 [11]. In standard nucleosynthesis
this corresponds to $\eta = (5.1 \pm 0.3) \times 10^{-10}$ and $\Omega_B h^2 = 0.019 \pm 0.001$. A large
sample is needed to establish that the abundance is universal, but it is 
remarkable that accuracies of this order may soon be achieved. 
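
The conversion between the baryon-to-photon ratio and the baryon density quoted above can be checked with a short calculation. The following is a minimal sketch, assuming a present photon temperature of 2.725 K and taking the mass per baryon to be the proton mass; both are assumptions of this illustration rather than inputs stated in the text, and the function names are hypothetical.

```python
import math

# Physical constants in SI units.
K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
G = 6.67430e-11           # Newton's constant, m^3 kg^-1 s^-2
M_PROTON = 1.67262192e-27 # proton mass, kg (assumed mass per baryon)
MPC = 3.0856775814913673e22  # one megaparsec in metres
ZETA_3 = 1.2020569031595942  # Riemann zeta(3)
T_CMB = 2.725             # assumed photon temperature today, K


def photon_number_density(temperature):
    """Blackbody photon number density, n = (2 zeta(3)/pi^2) (k T / hbar c)^3, in m^-3."""
    inverse_length = K_B * temperature / (HBAR * C)
    return 2.0 * ZETA_3 / math.pi**2 * inverse_length**3


def critical_density_for_h_equal_one():
    """Critical density 3 H^2 / (8 pi G) for H = 100 km/s/Mpc, in kg/m^3."""
    hubble = 100.0e3 / MPC  # s^-1
    return 3.0 * hubble**2 / (8.0 * math.pi * G)


def omega_b_h2(eta):
    """Baryon density parameter times h^2 for a given baryon-to-photon ratio."""
    baryon_mass_density = eta * photon_number_density(T_CMB) * M_PROTON
    return baryon_mass_density / critical_density_for_h_equal_one()


if __name__ == "__main__":
    for eta in (3.0e-10, 5.1e-10):
        print(f"eta = {eta:.1e}  ->  Omega_B h^2 = {omega_b_h2(eta):.4f}")
```

With these assumptions the lower bound of (2.1) and the deuterium value of $\eta$ reproduce the quoted baryon densities to within the stated errors.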

Within the standard approach to Big Bang nucleosynthesis, baryon number
is assumed to be constant and so in effect one is free to set $\eta$ by hand.
It is then dialled to $\sim 5 \times 10^{-10}$ to fit the observations. This ugly feature
prompted Sakharov to suggest an alternative, that baryon number was dynamically
generated in the universe prior to nucleosynthesis. It would then
be a number calculable from the fundamental laws of physics. He pointed 
out this requires three conditions: 

• B violation; 

• C and CP violation, since these operations take particles into antipar- 
ticles and thus reverse baryon number. Any theory respecting either 
symmetry would be bound to generate equal numbers of baryons and 
antibaryons; 




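• Departure from thermal equilibrium. This argument may be
summarised as:

$$ \mathrm{Tr}\left(e^{-\beta H} B\right) = \mathrm{Tr}\left((CPT)(CPT)^{-1} e^{-\beta H} B\right) = \mathrm{Tr}\left(e^{-\beta H} (CPT)^{-1} B\,(CPT)\right) = -\mathrm{Tr}\left(e^{-\beta H} B\right) \tag{2.2} $$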

using the fact that in any local field theory the Hamiltonian commutes 
with CPT, and that B reverses sign under C, but not under P or T. 
It follows that B is on average zero in thermal equilibrium. 

There are possible loopholes in the last two arguments. Recall that the 
same sort of argument would exclude spontaneous symmetry breaking. It is 
wrong because in the infinite volume limit the Hilbert space decouples into 
non-communicating sectors within each of which the symmetry is broken. 
Mechanisms for breaking C and CP spontaneously are well known. One 
has a scalar field which transforms under C and CP and then gives it a 
symmetry breaking potential. Generally, domain walls would be formed 
associated with the symmetry breaking, which could be inflated away. As 
far as I know no-one has found a similar mechanism for baryon number to be 
broken spontaneously and then for our universe to be in a nonzero baryon 
number state. But it is certainly conceivable such a mechanism exists. In
what follows I ignore this possibility.

In the same year as Sakharov made his proposal (1967), the standard 
electroweak theory was invented. With the later addition of QCD there 
emerged the Lagrangian of the standard model - dropping summation over 
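representation indices of $SU(3) \times SU(2) \times U(1)$ and generations, this is

$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi\, i\gamma^\mu D_\mu \psi - h\phi\bar\psi\psi + |D\phi|^2 - V(\phi). \tag{2.3} $$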

Twenty years later the arguments of Kuzmin, Rubakov and Shaposhnikov 
(prefigured in work by Dimopoulos and Susskind [3], and Klinkhamer and 
Manton [4]) that B violation could be important in theory (2.3) at high tem- 
perature opened the possibility that Sakharov’s conditions might actually 
be met. 

B violation occurs in a very beautiful way in the standard model, as a 
result of two facts. First, the electroweak theory is chiral, the SU{2) gauge 
fields only coupling to left handed fermion fields. This leads to an anomaly 
in the baryon number current, first noted by ’t Hooft in 1976 [1]. Second, 
and equally crucial, the topology of the vacuum manifold $\mathcal{M}_0$ (the set of
minima of the Higgs potential) has a nontrivial third homotopy group $\pi_3$
(since $\mathcal{M}_0 = S^3$). This allows for “permanent” changes of the baryon
number of the universe if one begins with zero net baryon number and then 
twists the gauge and Higgs fields up in the appropriate way. 

C and CP are violated in the standard model in a straightforward way,
the former maximally (no left handed antineutrinos). The latter is violated
by the complex phase which enters in the $h$ term (the Kobayashi-Maskawa
matrix) when there are more than two generations of quarks and leptons
(for a nice review see [12]).

For a long time, it was argued that in the standard model there would be
a phase transition in which the $SU(2)$ gauge symmetry was broken. This
would be a very natural place to make the baryon asymmetry if the phase 
transition were first order since then there would be supercooling and bubble 
nucleation and growth. These would produce dramatic departures from 
thermal equilibrium, satisfying Sakharov’s final condition. However more 
recent work has shown that whilst this picture is indeed correct at light 
Higgs and top quark masses, for the experimentally allowed values there is 
no phase transition at all [13]. 

One argument for a phase transition is that if the light degrees of freedom 
change discontinuously as the system cools one might expect discontinuities 
in the bulk properties such as the specific heat. At first glance this appears 
to be true in the standard model. Namely, at high temperatures (and 
for a continuous range of temperature) the Higgs field vev is zero and the 
physical, transverse gauge field degrees of freedom are massless. However 
at low temperatures the Higgs field turns on and the transverse gauge fields 
acquire a mass. One therefore has a parameter which is not analytic in the
temperature, and could therefore expect this nonanalyticity to show up in 
properties of the system. 

This naive argument is certainly wrong, as emphasised by Kajantie 
et al. [13]. The point is that in the high temperature phase, in a nonabelian
theory the transverse gauge fields acquire a finite temperature mass
(the “magnetic” mass) due to their carrying colour charge. In an abelian
theory this does not happen because photons carry no $U(1)$ charge. Thus
in a nonabelian theory at high temperature all degrees of freedom are mas- 
sive and the above argument for nonanalyticity fails. Kajantie et al. found 
through a combination of analytic methods and lattice simulations that in 
the experimentally allowed range for Higgs and top masses there is a smooth 
crossover from the high temperature to the low temperature phase. 

Nevertheless the perturbative calculations have also been checked to be 
valid at low Higgs and top masses. They are still very useful in broad ranges 
of parameter space in extensions of the standard model such as the MSSM. 
Calculations here show that a sufficiently strong first order
phase transition may happen as long as the standard Higgs has a mass 
below 115 GeV [14]. This is not far above the experimental lower bound 
from LEP of 95 GeV. (The range up to 110 GeV will be explored next 
year). The phase transition depends critically on the stop mass, which has 
to be between 100 GeV and the top mass. A large range of this parameter 
space will also be explored by the Tevatron at Fermilab in the next couple 
of years. So there are excellent prospects at least of ruling out the MSSM
as the origin of the baryon asymmetry. 

After this review of where we stand, let us turn to how Sakharov’s condi- 
tions are met in the standard model and minimal extensions thereof. Rather 
than give you a detailed account of the models I am going to concentrate 
on the central physical mechanisms, which are certainly the beautiful part 
of the subject. 

3 B violation in the standard model 

Baryon number is violated in the standard model because of chirality and 
topology. Rather than study the full blown 3+1 theory straightaway, I find
it very helpful to first visualise things in a much simpler 1+1 analog theory.
The analog theory is just the Abelian Higgs Model (AHM), described by the
same Lagrangian as (2.3), but with a $U(1)$ gauge group and “sombrero”
symmetry breaking potential, with minimum $\mathcal{M}_0 = S^1$ as illustrated in
Figure 4b.

The topology is made clearer by compactifying space, assuming the uni- 
verse is actually a circle of length L with periodic boundary conditions on 
the Higgs and gauge fields. There are then two winding numbers one can 
define 

$$ N_H = \frac{1}{2\pi}\int dx\,\partial_x\alpha, \qquad N_{CS} = \frac{g}{2\pi}\int dx\,A_x \tag{3.1} $$

where $\alpha$ is the phase of the Higgs field ($\phi = |\phi|e^{i\alpha}$). The Higgs field winding
number $N_H$ is genuinely topological, measuring the number of times $\phi$ winds
around the origin as one traverses the universe (i.e. the spatial circle).
However $N_H$ is undefined if the Higgs field vanishes somewhere. This leads
to the possibility that even in a smooth time evolution of the fields, $N_H$
can jump by an integer. In contrast the gauge field winding number $N_{CS}$
is always well defined, but is only necessarily an integer if the gauge field
is pure gauge. If $igA_x = \partial_x U(x)\,U^{-1}(x)$ with $U(x)$ a $U(1)$ valued group
element depending in a single valued manner on $x$, $U = e^{i\theta(x)}$ with
$\theta(x + L) = \theta(x) + 2\pi N$ and $N$ integral, then $N_{CS} = N$.

The classical vacua of the theory are labelled by the number of times 
the Higgs field winds around the potential minimum. In each vacuum, in 
order for the Higgs field gradient energy to be minimised, one must have 
$igA_x = -i\partial_x\alpha$ and thus $N_{CS} = -N_H = N$, an integer. These states are
called the “$N$-vacua” of the theory.
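
The two winding numbers of (3.1) are easy to evaluate on a discretised circle. The following is a minimal sketch, assuming a uniform lattice and a Higgs phase that winds linearly around the circle; the function names and the lattice set-up are hypothetical choices made for illustration only.

```python
import cmath
import math


def higgs_winding_number(phis):
    """Sum the phase differences between neighbouring sites of a periodic
    lattice, each wrapped into (-pi, pi], and divide by 2 pi."""
    total = 0.0
    sites = len(phis)
    for i in range(sites):
        here, there = phis[i], phis[(i + 1) % sites]
        if here == 0 or there == 0:
            # The winding number is undefined where the Higgs field vanishes.
            raise ValueError("Higgs field vanishes on the lattice")
        total += cmath.phase(there / here)
    return round(total / (2.0 * math.pi))


def chern_simons_number(g_a_x, dx):
    """Lattice version of N_CS = (g / 2 pi) * integral dx A_x.
    The input list holds the product g * A_x at each site."""
    return sum(g_a_x) * dx / (2.0 * math.pi)


def n_vacuum(higgs_winding, length=1.0, sites=200):
    """Build a vacuum configuration: unit-modulus Higgs field whose phase
    winds `higgs_winding` times, with g A_x = -d(alpha)/dx so that the
    covariant gradient of the Higgs field vanishes."""
    dx = length / sites
    wavenumber = 2.0 * math.pi * higgs_winding / length
    phis = [cmath.exp(1j * wavenumber * i * dx) for i in range(sites)]
    g_a_x = [-wavenumber] * sites
    return phis, g_a_x, dx


if __name__ == "__main__":
    phis, g_a_x, dx = n_vacuum(2)
    print(higgs_winding_number(phis), chern_simons_number(g_a_x, dx))
```

For a configuration built this way the Chern-Simons number comes out equal to minus the Higgs winding number, as in the vacuum condition stated above.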

The $N$-vacua are all equivalent under large gauge transformations, ones
in which the phase wraps some nonzero number of times as one passes
around the spatial circle. One might be forgiven therefore for thinking the
huge vacuum degeneracy was just a remnant of the gauge symmetry and
should be ignored. This is not so because transitions between the different
$N$-vacua cannot be gauged away. Such transitions occur if the Higgs field
changes winding number, with the gauge field naturally tagging along in
order to minimise the energy. Such a transition requires an energy input to
lift the Higgs field off $\mathcal{M}_0$ and to create the necessary electric fields: one
sees that $\partial_t N_{CS} \propto \int dx\, E$ with $E$ the electric field. (The term in $\partial_x A_t$ is
zero due to the periodicity.)

Fig. 1. Energy of the minimal energy configuration at each Chern Simons number
for the Abelian Higgs model. The maximum is at half-integral Chern Simons
number.

What is the minimal energy input needed to change $N$ by one unit?
Consider a continuous path from one $N$-vacuum to the next. For each value
of $N_{CS}$ modulo integers there is a configuration of least energy. Parity
invariance of the scalar-Higgs equations implies that the energy is even
under $N_{CS} \to -N_{CS}$. This combined with symmetry under $N_{CS} \to N_{CS} + 1$
from “large” gauge invariance establishes that if there is a single maximum
of the energy it must occur at $N_{CS} = \frac{1}{2}$. But at this point, the minimal
energy configuration is actually a stationary point of the energy under all
variations and is therefore a classical solution. It was called a “sphaleron”
solution by Klinkhamer and Manton, connoting the fact that it is unstable
to variations which alter $N_{CS}$, and under which it can “fall off the hill”.

The sphaleron barrier is almost insuperable in the present ground state 
of the electroweak theory. The sphaleron energy is on the order 10 TeV, and 
going over it requires a long wavelength coherent excitation of the Higgs and 
gauge fields. However at high temperature, the long wavelength bosonic field
modes have large occupation numbers and the energy input needed to go 
over the “sphaleron barrier” is available from thermal fluctuations. Early 
simulations by Grigoriev and collaborators [15] in the 1+1 dimensional
AHM provided important confirmation of this point. 

It is intriguing enough to conceive of changing the winding number of 
the Higgs and gauge fields in the vacuum of our universe by an input of 
energy. But the physics becomes even more interesting when one includes 
fermions. Initially, suppose they are massless ($h = 0$ in (2.3)), but have
gauge coupling $g_F$ so that on the fermions $D_\mu = \partial_\mu + i g_F A_\mu$. There appears
to be a global axial symmetry $\psi \to e^{i\theta\gamma_5}\psi$ but the corresponding Noether
current $j^\mu_5 = \bar\psi\gamma^\mu\gamma_5\psi$ has an anomaly:

$$ \partial_\mu j^\mu_5 = \frac{g_F}{2\pi}\,\epsilon^{\mu\nu}F_{\mu\nu} \tag{3.2} $$

which means that the number of left or right handed particles is not conserved.
The right hand side is actually proportional to the divergence of a
current, $\epsilon^{\mu\nu}F_{\mu\nu} \propto \partial_\mu j^\mu_{CS}$, where $j^\mu_{CS}$ is the Chern Simons current.

Equation (3.2) has a beautiful explanation in terms of spectral flow, first 
understood by Lieb and others in condensed matter physics (see [21] for a 
review). It is simplest to work in the $A_0 = 0$ gauge, in which the 1+1 Dirac
equation is

$$ i\partial_t\psi = \gamma_5\left(-i\partial_x + g_F A_x\right)\psi \tag{3.3} $$

where $\psi$ is a two component Dirac spinor, and $\gamma_5\psi_{R,L} = \pm\psi_{R,L}$. Let us
consider the fermions in the presence of a classical background gauge field,
that is, ignore the back-reaction of the fermions on the gauge field. We
shall be interested in turning on an electric field. This could be done by 
pair creating two charges and then moving them apart. The field then points 
from the positive to the negative charge along the shorter arc on the circle. 
By creating a large number of pairs, and annihilating nearest neighbour 
positive and negative charges we can arrange to turn on a uniform electric 
field of arbitrary strength in our one dimensional “universe”. Of course
in the situation of interest, the electric field will actually be produced
by the Higgs field current associated with the transition between two
$N$-vacua.

In any case assume we have turned on a small electric field, $\partial_t A_x$, which
in the absence of charges must by Gauss’s law be spatially uniform. But
if $A_x$ is constant, we can make the usual ansatz $\psi \sim e^{-i(Et - px)}$ with
$p$ constant. We then read off from (3.3) the dispersion relations for the
right-movers (R) and left-movers (L),

$$ E = \pm(p + g_F A_x), \qquad \gamma_5 = \pm 1. \tag{3.4} $$

As one turns on an electric field and changes $A_x(t)$, the canonical momentum
of a given state ($p = 2\pi n/L$, with $L$ the box size and $n$ an integer) is
unchanged ($p$ is quantised and therefore forced to stay constant) but the
energy associated with that state changes. In particular as the Chern-Simons
number changes by precisely one unit (which it would if we moved
our hypothetical source charges around the circle to annihilate), the energy
levels change by $\delta E_{L,R} = \mp 2\pi/L$. That is, the L states move down by
one quantum and the R states move up by one quantum. If we begin in
a state which has all the negative energy levels filled (the “Dirac sea”),
then we end in a state which has one R particle and one L hole, both at
very small momentum. The change in the number of right handed minus
left handed particles (relative to the filled “Dirac sea”) is then seen to be
$\Delta(N_R - N_L) = 2\Delta N_{CS}$, which is exactly what the spacetime integrated
version of (3.2) implies.
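
The level-crossing count described above can be reproduced by bookkeeping on a finite set of momentum states. The following is a minimal sketch, assuming the dispersion relations (3.4) with energies measured in units of $2\pi/L$, occupations that stay attached to each label $n$ as $A_x$ changes, and a starting value of $g_F A_x$ offset by half a level spacing so that no level sits exactly at zero energy. The offset, the cutoff `n_max` and the function names are hypothetical choices for this illustration.

```python
def net_number(energies, filled):
    """Particles (filled levels with positive energy) minus holes
    (empty levels with negative energy), i.e. the number relative
    to the filled Dirac sea."""
    particles = sum(1 for e, f in zip(energies, filled) if f and e > 0)
    holes = sum(1 for e, f in zip(energies, filled) if not f and e < 0)
    return particles - holes


def chiral_charge_change(delta_ncs, n_max=50, offset=0.5):
    """Return the change in N_R - N_L when the Chern-Simons number
    changes by the integer delta_ncs (with |delta_ncs| < n_max).

    In units of 2 pi / L the levels are E_R = n + a and E_L = -(n + a),
    where a = g_F A_x L / (2 pi) increases by delta_ncs."""
    labels = range(-n_max, n_max + 1)
    a_start = offset
    a_end = offset + delta_ncs

    # Initial state: every negative-energy level is filled.
    filled_right = [(n + a_start) < 0 for n in labels]
    filled_left = [-(n + a_start) < 0 for n in labels]

    # Final energies, with the occupations carried along unchanged.
    n_right = net_number([n + a_end for n in labels], filled_right)
    n_left = net_number([-(n + a_end) for n in labels], filled_left)
    return n_right - n_left


if __name__ == "__main__":
    for delta in (1, 2, -1):
        print(delta, chiral_charge_change(delta))
```

Each unit change of the Chern-Simons number lifts one right-handed level out of the sea and pushes one empty left-handed level into it, so the count returns twice the change in $N_{CS}$.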

This creation of particles from the vacuum only works because of a cru- 
cial subtlety. The theory is regularised by disregarding the particles in 
the “Dirac sea” at infinite negative energy. Thus particles may be “fished 
out” or “dumped into” the Dirac sea, in violation of the naive Noether 
conservation law. Thus anomalies connect very high energy behaviour (reg- 
ularisation) to very low energy behaviour (creation of massless particles at 
essentially zero momentum). 

This clash is also at the heart of why anomalies are potentially ru- 
inous to the consistency of the quantum theory. If the particles created 
via an anomalous process carry nonzero electric charge, the process violates 
Gauss’s law and would render Maxwell’s equations inconsistent. In other 
words, if we give left and right handed components of our $\psi$ field gauge
charges $g_L$ and $g_R$ so that the gauge current is $j^\mu = g_L\bar\psi_L\gamma^\mu\psi_L + g_R\bar\psi_R\gamma^\mu\psi_R$,
from (3.2) it follows that the electromagnetic current is conserved (and
therefore Maxwell’s equation $\partial_\mu F^{\mu\nu} = j^\nu$ consistent) only if $g_L^2 = g_R^2$. In
the vectorlike theory, with $g_L = g_R$, zero net particle number is produced.
But in the axial theory, with $g_L = -g_R$, each unit change in $N_{CS}$ produces two
particles with opposite electric charge. Note also that in this case the anomalous
current involved is “vectorlike”, and so no mass term appears in the
anomaly equation.

Before passing to the higher dimensional case, note that the energy 
of the system may be calculated as a function of left and right handed 
fermion number. At first sight, the energy appears not to be changed in 
one transition, as the right handed levels move up by exactly the same 
amount as the left handed levels move down. However when one computes 
the regularised energy one finds that there is a term involving {Afcs ~ , 

so that the symmetric state in which right and left handed filled levels are 
at identical energies, is the minimal energy state [21]. 

In 3+1 dimensions one may compactify space by identifying fields at
infinity so that space becomes an $S^3$. For the standard model, ignoring
hypercharge one has the gauge group $SU(2) = S^3$ and the space of minima
of the Higgs potential is also an $S^3$. Thus with the substitution of $S^3$ for $S^1$,
one has the precise analogue of the 1+1 Abelian Higgs model discussed
above. There is also a level crossing explanation of the anomaly. Consider
a massless chiral fermion in a constant background magnetic field $B_z$ which
we can represent via $A_y = -B_z x$. Again in $A_0 = 0$ gauge the Dirac equation
reads

$$ i\partial_t\psi = -i\gamma^5\,\Sigma\cdot D\,\psi \tag{3.5} $$

where $\Sigma$ is the spin matrix and $D$ the covariant derivative. In a chiral
basis, we express the Dirac spinor $\psi = (\psi_R, \psi_L)$ where $\gamma^5\psi_{R,L} = \pm\psi_{R,L}$ and
$\Sigma = \sigma$, the Pauli matrices, on both. Now the transverse Dirac operator has
a number of zero modes

$$ \begin{pmatrix} 0 & \partial_x - i\partial_y - gB_z x \\ \partial_x + i\partial_y + gB_z x & 0 \end{pmatrix}\psi_0 = 0 \tag{3.6} $$

where

$$ \psi_0 \sim \exp\left[-\tfrac{1}{2}gB_z\left(x - k_y/gB_z\right)^2 + ik_y y\right] \tag{3.7} $$

labelled by $k_y$. These are the lowest Landau levels. In these energy levels the
spin is aligned with the magnetic field, and the longitudinal Dirac equation
is precisely the same as the 1+1 dimensional Dirac equation we studied
above! The number of transverse zero modes is seen as follows. The centre
of the Gaussian in $x$ is $x = k_y/gB_z$. As we increase $k_y = 2\pi N/L$ the
centre runs along in the $x$ direction. We cannot go further than $x = L$, the
transverse size of the system, before repeating, so the maximum value of $N$
is given by $(2\pi N/L)(gB_z)^{-1} = L$, or $N = gB_zL^2/2\pi$. This is the density of
states at fixed $B_z$. From the one dimensional argument, we infer that as we
switch on an electric field in the $z$ direction, $E_z = \partial_t A_z$ we have




$$ \Delta(N_R - N_L) = 2\,\frac{gB_zL^2}{2\pi}\,\frac{gE_zTL}{2\pi} \tag{3.8} $$



where the last factor arises from the change in $A_z$ if we leave the electric field on for
a time $T$. But the right hand side is just the time integral of the anomaly,
proportional to $\int d^3x\,\mathbf{E}\cdot\mathbf{B}$, and the above represents a simple derivation of the anomaly
equation, which in 3+1 dimensions and for a nonabelian or abelian theory
reads

$$ \partial_\mu j^\mu_5 = \frac{g^2}{8\pi^2}\,\mathrm{Tr}\left(F_{\mu\nu}\tilde F^{\mu\nu}\right) \tag{3.9} $$

where the trace is evaluated in the representation the fermions are in. Note
that $F_{\mu\nu}\tilde F^{\mu\nu} = 4\,\mathbf{E}\cdot\mathbf{B}$.
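
The statement that the Gaussian labelled by $k_y$ is annihilated by the transverse operator, and the resulting count of lowest Landau level states, can be checked numerically. The following is a minimal sketch using central finite differences; the values chosen for $gB_z$ and $k_y$, the step size and the function names are hypothetical and serve only as an illustration.

```python
import cmath
import math


def psi0(x, y, g_b, k_y):
    """Candidate zero mode: Gaussian in x centred at k_y / (g B_z),
    times a plane wave in y."""
    return cmath.exp(-0.5 * g_b * (x - k_y / g_b) ** 2 + 1j * k_y * y)


def zero_mode_residual(x, y, g_b=1.3, k_y=0.7, step=1e-5):
    """Magnitude of (d/dx + i d/dy + g B_z x) psi0 at the point (x, y),
    with the derivatives approximated by central differences."""
    d_dx = (psi0(x + step, y, g_b, k_y) - psi0(x - step, y, g_b, k_y)) / (2 * step)
    d_dy = (psi0(x, y + step, g_b, k_y) - psi0(x, y - step, g_b, k_y)) / (2 * step)
    return abs(d_dx + 1j * d_dy + g_b * x * psi0(x, y, g_b, k_y))


def lowest_landau_level_count(g_b, length):
    """Number of transverse zero modes, N = g B_z L^2 / (2 pi),
    for a square transverse section of side `length`."""
    return g_b * length**2 / (2.0 * math.pi)


if __name__ == "__main__":
    print(zero_mode_residual(0.4, -0.3))
    print(lowest_landau_level_count(g_b=1.3, length=10.0))
```

The residual is at the level of the finite-difference error, consistent with the Gaussian being an exact zero mode of the lower-left entry of the operator.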

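In the standard model the gauge currents are all anomaly free and thus
the gauge field equations are rendered consistent, through a series of miraculous
cancellations. However the baryon number and lepton number currents
are anomalous, as can be seen by writing $j^\mu_B = \frac{1}{3}\sum_{F,c}\left(\bar q_R\gamma^\mu q_R + \bar q_L\gamma^\mu q_L\right)$
where $q_{R,L} = \frac{1}{2}(1 \pm \gamma^5)q$ and the sum runs over families $F$ and colour $c$. Only
the left handed quarks couple to the $SU(2)$ gauge fields and one obtains
from (3.9)

$$ \partial_\mu j^\mu_B = \partial_\mu j^\mu_\ell = -N_F\,\frac{g^2}{32\pi^2}\,F^a_{\mu\nu}\tilde F^{a\,\mu\nu} \tag{3.10} $$

where $j^\mu_\ell$ is the lepton number current and $N_F$ is the number of generations
i.e. three. $F^a_{\mu\nu}$ is the $SU(2)$ gauge field strength and
$\tilde F^{a\,\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}F^a_{\rho\sigma}$ its dual. The 1/3 in the baryon
number current is cancelled by the sum over colours. The generators are
normalised to obey $\mathrm{Tr}(T^aT^b) = \frac{1}{2}\delta^{ab}$. I have ignored a (nontopological)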
hypercharge term here. It can only lead to “temporary” violations of baryon
and lepton number because there are no $N$-vacua for a $U(1)$ theory in three
dimensions. But for the $SU(2)$ gauge fields, there is a winding number
measuring the number of times a gauge group element traverses $SU(2) = S^3$
as one traverses three dimensional space with infinity identified, which is
topologically $S^3$. The spacetime integral of (3.10) implies that a change in the
Chern-Simons number by one unit creates three baryons and three leptons.
Just as in one dimension, the right hand side of (3.10) is the divergence
of a current, and the integral of the time component of this current is the
Chern-Simons number. Note that as in the one dimensional example, there
is no fermion mass term in (3.10) since the baryon and lepton number currents are vectorlike.

The rate of change of baryon number is calculated from (3.10) to be
proportional to the space integral of $\mathbf{E}\cdot\mathbf{B}$. This is zero on average in thermal
equilibrium, from time reversal invariance. But nevertheless, it has a detectable
effect. For a pure gauge-Higgs system, in which there is no energy
cost in making sphaleron transitions, $\mathbf{E}\cdot\mathbf{B}$ behaves like a random driving
term, and the Chern-Simons number performs a random walk, with the
mean square Chern-Simons number per unit volume growing as
$\langle N_{CS}^2\rangle/V \propto \Gamma_s t$, $t$ being the time and $\Gamma_s$ a rate per unit
volume per unit time. This rate was first estimated semiclassically in the
broken phase, with the result that $\Gamma_s \propto e^{-E_s/T}$ with the sphaleron energy
$E_s = (1.5 - 2.7) \times 2M_W/\alpha_2$ for Higgs self coupling ranging from zero to infinity [4].

A great deal of work has since been done to understand the rate to higher 
accuracy in the broken phase, and especially in the unbroken phase where 
it is not exponentially suppressed and for which a semiclassical estimate is 
not possible. Indeed a whole range of new techniques for studying real time
phenomena at finite temperature has been developed, which are likely to 
have spinoff into other fields (for a nice recent review see [16]). 

Initially, simulations of the classical field theory performed by Ambjorn
and collaborators found that

$$ \Gamma_s = \kappa(\alpha_2 T)^4 \tag{3.11} $$

with $\kappa \sim 1.0$ [7]. This was in agreement with naive expectations that the rate
would be determined by the magnetic mass (mentioned above) in the unbroken
phase, $M_{\rm mag} \sim \alpha_2 T$. However more detailed reflections indicated
that the Debye mass should enter in the expression [17]. Recently more
sophisticated simulations using Langevin methods and careful matching of
the lattice and continuum theories yielded the result

$$ \Gamma_s = (20 - 25)\,\alpha_2^5 T^4 \tag{3.12} $$

which is (apparently fortuitously), for the measured value $\alpha_2 \sim 1/30$, in remarkably
close agreement with the original simulations of Ambjorn et al.
So whilst the parametric dependence on $\alpha_2$ of (3.11) was not correct, the
numerical value was! In any case, the dimensionless rate of baryon number
violation is at least in the ballpark of the values of $\eta$ that we seek to explain.
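
The numerical agreement between (3.11) and (3.12) is a matter of simple arithmetic, sketched below. The value $\alpha_2 = 1/30$ is taken as an approximate input, and the function names are hypothetical.

```python
ALPHA_2 = 1.0 / 30.0  # approximate weak coupling strength, an assumed input


def rate_alpha4(kappa, alpha=ALPHA_2):
    """Dimensionless rate Gamma_s / T^4 in the form kappa * alpha^4, as in (3.11)."""
    return kappa * alpha**4


def rate_alpha5(coefficient, alpha=ALPHA_2):
    """Dimensionless rate Gamma_s / T^4 in the form coefficient * alpha^5, as in (3.12)."""
    return coefficient * alpha**5


def effective_kappa(coefficient, alpha=ALPHA_2):
    """Value of kappa for which (3.11) reproduces (3.12) at the given alpha."""
    return coefficient * alpha


if __name__ == "__main__":
    for coefficient in (20.0, 25.0):
        print(coefficient, effective_kappa(coefficient), rate_alpha5(coefficient))
    print("kappa = 1:", rate_alpha4(1.0))
```

With this input the coefficient range 20 to 25 corresponds to an effective $\kappa$ between roughly 0.7 and 0.8, and a dimensionless rate of order $10^{-6}$.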

The linear growth of $\langle N_{CS}^2\rangle$ with time only occurs if one neglects the free
energy cost associated with producing chiral particle number. As in the
one dimensional example, this is proportional to the square of the particle
number produced (or its chemical potential). The complete picture of energy
or free energy versus $N_{CS}$ is therefore not that shown in Figure 1 but rather
the same graph with a small quadratic term added, which is minimised in
the vacuum state where there is zero net chiral particle number (or baryon
number).



4 The electroweak phase transition or crossover 



Another fascinating coincidence, needed for Sakharov’s third condition, is 
that a first order transition is possible and indeed generic for spontaneously 
broken gauge theories at weak coupling. The key effect is attributed in 
condensed matter physics to Halperin, Lubensky and Ma and is relevant in 
superconductors. Consider a Higgs field coupled to bosons and fermions via 
mass terms, as in the standard model. In the one loop approximation, the
standard formula for the free energy is:

$$ \mathcal{F} = V(\phi) + T\sum_{\rm species}\pm\int\frac{d^3p}{(2\pi)^3}\,\ln\left(1 \mp e^{-\sqrt{p^2+m^2}/T}\right) \tag{4.1} $$

with $V(\phi)$ the zero temperature potential and the $\pm$ signs holding for bosons
and fermions respectively. We ignore the Higgs excitations: at weak Higgs
self-coupling their masses are small and their effect is suppressed. The
second term in (4.1) may be thought of as due to the pressure of particle
collisions on an interface between a region where $\phi \neq 0$ and one where $\phi = 0$.

Clearly the pressure from a gas of massless particles at temperature $T$ is
greater than that from massive particles because there are more massless
particles and they travel faster and therefore collide with the interface at a
higher rate. So while at low temperatures the zero temperature potential
$V(\phi)$ favors the broken symmetry phase by giving it a higher pressure (recall
the relation $P = -\mathcal{F}$), at high temperature the excitations instead give
the unbroken phase higher pressure. This is the physical reason behind
symmetry restoration in spontaneously broken field theories.
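
The claim that a massless gas exerts more pressure than a massive one at the same temperature can be illustrated by evaluating the thermal term of (4.1) directly. The following is a minimal sketch for a single bosonic degree of freedom, assuming the standard form of the one loop thermal integral and using Simpson's rule with a finite upper cutoff; the cutoff, the number of steps and the function names are hypothetical choices.

```python
import math


def thermal_integral_boson(y, x_max=40.0, steps=4000):
    """J(y) = integral from 0 to infinity of x^2 ln(1 - exp(-sqrt(x^2 + y^2))) dx,
    evaluated by Simpson's rule on [0, x_max]; y stands for m/T.
    `steps` must be even."""
    def integrand(x):
        energy = math.sqrt(x * x + y * y)
        if energy == 0.0:
            return 0.0  # x^2 ln(...) tends to zero as x -> 0 when y = 0
        return x * x * math.log1p(-math.exp(-energy))

    h = x_max / steps
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def boson_pressure_over_t4(m_over_t):
    """Thermal pressure P / T^4 of one bosonic degree of freedom,
    using P = -F with F / T^4 = J(m/T) / (2 pi^2)."""
    return -thermal_integral_boson(m_over_t) / (2.0 * math.pi**2)


if __name__ == "__main__":
    for ratio in (0.0, 0.1, 1.0, 3.0):
        print(ratio, boson_pressure_over_t4(ratio))
```

In the massless limit the result approaches the blackbody value $\pi^2/90$ per degree of freedom, and the pressure falls steadily as $m/T$ grows.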

For bosonic excitations getting a mass from $\phi$, like massive gauge fields,
a remarkable thing happens, namely in spite of (4.1) being apparently a
function of $m^2$, its expansion in powers of $m/T$ contains a negative cubic
term:

$$ \mathcal{F} \approx V(\phi) + \sum_{\rm bosons}\left(\frac{m^2T^2}{24} - \frac{m^3T}{12\pi} + \ldots\right) + \sum_{\rm fermions}\left(\frac{m^2T^2}{24} + \ldots\right) \tag{4.2} $$
where the ... terms are of order $m^4$. The reason for the apparent
contradiction is that the Bose-Einstein distribution function is singular at
low energy, and in effect an infrared divergence is cut off by the particle
mass $m$. Since the excitation mass $m \propto \phi$, the negative cubic term
creates a similar negative cubic term in $\mathcal{F}(\phi,T)$, and it is easy
to see that this leads to the appearance of a “false vacuum” in the free
energy as the system cools from high temperature. The system is then forced
to jump from the false to the true vacuum, and this occurs via the nucleation
of bubbles. A rough criterion for the validity of the one loop calculation is
that the Higgs quartic self-coupling $\lambda$ parametrically obeys
$g^4 \ll \lambda \ll g^2$, the first condition arising because the
$m^4\log m^2$ term coming from the gauge fields should not outweigh the tree
level quartic term. The second condition allows one to argue that the
contribution from Higgs excitations should be negligible compared to that
from the gauge bosons.
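The way a negative cubic term produces a false vacuum can be illustrated with a toy free energy. The sketch below assumes the commonly used parametrisation $\mathcal{F}(\phi,T) = D(T^2 - T_0^2)\phi^2 - ET\phi^3 + \frac{\lambda}{4}\phi^4$; the coefficients `D`, `E`, `lam` and `T0` are illustrative numbers chosen for this example, not values taken from the text.

```python
import math

# Hypothetical coefficients for a toy free energy; none are from the text.
D, E, LAM, T0 = 0.2, 0.05, 0.1, 100.0


def free_energy(phi, T):
    """Toy free energy: quadratic thermal mass, negative cubic, quartic."""
    return D * (T**2 - T0**2) * phi**2 - E * T * phi**3 + 0.25 * LAM * phi**4


def critical_temperature():
    """Temperature at which the two minima are degenerate."""
    return T0 / math.sqrt(1.0 - E**2 / (LAM * D))


def broken_minimum_at_tc():
    """Location of the nonzero minimum at the critical temperature."""
    return 2.0 * E * critical_temperature() / LAM


if __name__ == "__main__":
    tc = critical_temperature()
    phi_c = broken_minimum_at_tc()
    print("T_c =", tc, " phi_c =", phi_c)
    print("barrier height at T_c:", free_energy(0.5 * phi_c, tc))
```

At the critical temperature the minima at $\phi = 0$ and $\phi = \phi_c$ are degenerate and separated by a barrier, so the transition must proceed by bubble nucleation. Setting `E = 0` removes the barrier, which is why the cubic term is the essential ingredient.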

By now the perturbative and nonperturbative calculation of the elec- 
troweak theory free energy at high temperatures has been carried out in 
enormous detail. Perturbative calculations have been performed in the 
MSSM, to two loops, leading to the constraints and bounds mentioned 
above [14]. 

5 Baryon production 

A first order phase transition produces supercooling and strong departures 
from thermal equilibrium. The bubble nucleation rate is exponentially small 
at weak coupling and for light Higgs masses one finds the bubbles nucleate 
and grow at a substantial fraction of the speed of light. The bubble walls 
induce strong departures from thermal equilibrium, which is a good start 
for producing a nonzero baryon number. 

Calculations of the produced baryon asymmetry are difficult, and have 
generated some controversy in the field. The framework I shall briefly de- 
scribe below involves using a classical Boltzmann transport equation for the 
particle phase space density. These ideas were applied in [24, 36-38] . 

6 Two-Higgs baryogenesis

How does one create a baryon asymmetry from the anomaly (3.10)? Integrating
that equation one finds

$$B \propto \int d^4x\, \mathbf{E}\cdot\mathbf{B}. \qquad (6.1)$$

It is clear that to generate an asymmetry $\mathbf{E}\cdot\mathbf{B}$ must therefore be
biased. The idea that bubble walls could be involved in this biasing was first
suggested in an extension of the standard model involving explicit lepton number
violation [28]. Later it was realised that there was a biasing mechanism
at work in even the most minimal extension of the standard model, the 
two Higgs theory [29]. A further development was the realisation that 
there exist important transport mechanisms [34, 40] which can transport 
certain approximately conserved global charges from the vicinity of the wall 
where they are produced into the unbroken phase where B violation is most 
efficient. 

In a two Higgs theory, there is a Higgs scalar which is odd under C and
CP, which is the phase of the product of the two Higgs fields:
$\varphi_1^\dagger\varphi_2 = Re^{i\theta}$. On bubble walls, as the vevs of
$\varphi_1$ and $\varphi_2$ make the transition from zero in the unbroken phase
to their nonzero broken phase values, there is a favored path for the phase
$\theta$. As a bubble wall passes a given point in space, an observer at rest
in the plasma frame would see a nonzero value of $\dot\theta$, the sign and
magnitude being determined by parameters in the Higgs potential. As argued
above, for anomaly-driven baryogenesis one not only needs the usual C and CP
violation, but also P violation. Now the classical equations of motion of the
gauge and Higgs fields are P invariant, and so cannot drive the anomaly. The
observation Zadrozny and I made was that if you integrate out the fermions
(whose couplings do violate P), then you get from a one loop triangle diagram
a term proportional to $\int d^4x\, \theta F_{\mu\nu}\tilde F^{\mu\nu}$ in the
effective action for the gauge and Higgs fields. Upon integrating by parts
in time, one sees this is equivalent to $\int dt\, \dot\theta N_{CS}$, i.e. a
linear potential for the Chern-Simons number, which one might expect drives it
in a definite direction. Numerical simulations of the full classical field
equations confirmed this interpretation [29]. More detailed investigations
have since been performed [30].

A development of this scenario came when the analogous term in the 
effective action was computed at finite temperature by McLerran et al. [31]. 
This gave the result 



7C(3)/mt\2 g 1 
4 Wr / IGtt^ Uq 






( 6 . 2 ) 



which is suppressed compared to the naive zero temperature result by the
Matsubara factor $(m_t/\pi T)^2$. Physically, this is due to the fact that the effect
results from integrating out virtual fermions, but at finite temperature these 
states are already occupied, and the Pauli exclusion principle prevents the 
virtual fermions propagating. 

The classical equations following from introducing (6.2) into the gauge-Higgs
action are perfectly well defined, and may be followed in a real time
computer simulation. This was actually done by Grigoriev et al. in the 1+1
dimensional “two-Higgs” theory, where a parity violating term analogous
to (6.2) also exists. The simulations involved a phase transition “mocked
up” by simply changing the Higgs potential by hand after equilibrating the
system in an unbroken phase. They did indeed show a baryon asymmetry
being produced [32].

Using the classical field equations, and averaging nonlinear terms in a 
judicious manner, one can estimate the final baryon asymmetry produced 
from this mechanism, and the answer is [8,31] 



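$$\frac{n_B}{s} \approx \left(\frac{m_t}{\pi T}\right)^2 \frac{\kappa\alpha_W^4}{g_*}\,\frac{1}{\pi^2} \times \theta_{CP} \sim 10^{-10}\,\theta_{CP} \qquad (6.3)$$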



where $\theta_{CP}$ is a measure of CP violation in the Higgs potential, which may be
of order unity. Various assumptions were made in this derivation, spelled out 
in [8], and it seems clear that this estimate could be numerically wrong by a 
factor of 10^^. The reason for each factor is however clear. The first factor 
is Matsubara/Pauli suppression. The second is the sphaleron rate constant 
divided by the number of particle degrees of freedom at the electroweak
scale, $g_* \sim 100$. Third there is a factor $1/\pi^2$ because the effect occurs
at 1 loop, and finally there are the CP violating parameters in the Higgs
potential. The result is comparable to what is needed if $\theta_{CP}$ is large, but measurements
of the neutron electric dipole moment typically constrain these parameters 
to be smaller than ~ 10“^. So this mechanism is at best on the edge of 
viability. 



7 Baryogenesis from transport 

The mechanism described above is local in the sense that it involved a local 
term in the gauge-higgs equation of motion. An important observation was 
made by Cohen et al. that nonlocal effects could significantly enhance the 
result [34] . Their suggestion was that it might be more efficient to transport 
the C violation from the bubble wall into the unbroken phase far in front 
of it, because in that phase B violation is unsuppressed. In particular, they 
showed that if the C violating field $\theta$ mentioned above varies across the
bubble wall, then the quantum scattering amplitudes for particles off the
bubble wall are different for left and right handed particles. This means
that an excess of left handed over right handed particles (or vice versa) is
injected into the medium in front of the bubble wall. Such an excess in
principle can last for quite a time before the wall catches up: in a time $t$
a particle diffuses a distance $x \sim \sqrt{Dt}$, with $D$ the diffusion
constant. The wall catches up when $v_w t = x$, $v_w$ being the wall velocity.
So $t \sim D/v_w^2$, which may be quite long if $D$ is big and $v_w$ small.
And during the time this
chiral excess exists, it drives the anomalous processes (which act to try and 
restore chiral neutrality). 
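The catch-up estimate above is simple enough to evaluate directly. The following sketch works in arbitrary natural units; the function names and the sample values of `D` and `v_w` are hypothetical choices made only to illustrate the scaling $t \sim D/v_w^2$.

```python
import math


def diffusion_length(D, t):
    """Typical distance diffused in time t: x ~ sqrt(D t)."""
    return math.sqrt(D * t)


def wall_catch_up_time(D, v_w):
    """Time at which the wall position v_w * t equals sqrt(D t)."""
    return D / v_w**2


if __name__ == "__main__":
    D, v_w = 4.0, 0.1  # illustrative values only
    t = wall_catch_up_time(D, v_w)
    print("catch-up time:", t)
    print("diffusion length:", diffusion_length(D, t), " wall position:", v_w * t)
```

Halving the wall velocity quadruples the time available for the chiral excess to bias the anomalous processes, which is why slow walls and large diffusion constants favour this mechanism.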

Another point emphasised by Cohen et al. was that if one is interested
only in the biasing effects which an “injected” chiral excess of fermions has
in the unbroken phase, it is possible to compute this effect with no knowledge
of the details of the gauge and Higgs field dynamics other than the rate
constant $\Gamma_s$, using the following general argument. We can describe a state
of nonzero baryon number using a chemical potential $\mu_B$. In the presence of
this chemical potential, the free energy cost of producing a baryon (treated
as massless) is just $\mu_B$ (since $B \propto \mu_B$, the free energy of a state
with baryon number $B$ is proportional to $B^2$, minimised at $B = 0$). Now
consider the rate for the system at baryon number $B$ to make a transition to
baryon number $B + 1$. We call this $\Gamma_+$. Conversely the rate for $B$ to
decrease by 1 is called $\Gamma_-$. Detailed balance tells us that in equilibrium
$\Gamma_+ p(B) = \Gamma_- p(B + 1)$ (using classical statistics), where $p(B)$ is
the probability of having baryon number $B$. Since
$p(B + 1) = p(B)e^{-\mu_B/T}$, we deduce the ratio of $\Gamma_+$ to $\Gamma_-$.
The second fact we can use is that by symmetry, we must have
$\Gamma_+(\mu_B) = \Gamma_-(-\mu_B)$. These two equations then uniquely fix
$\Gamma_\pm \approx (1 \mp \mu_B/2T)\,\Gamma_s$ at small $\mu_B$. Following this
logic through for a system with arbitrary chemical potentials for each fermion
species, one finds

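$$\dot B = -N_F\,\frac{\Gamma_s}{2T}\sum_{\rm families}\left(3\mu_{u_L} + 3\mu_{d_L} + \mu_{e_L} + \mu_{\nu_L}\right) \qquad (7.1)$$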



which for massless fermions is just proportional to left handed fermion
number. This makes sense: the anomaly responds to all left handed fermions
equally, since all couple to $SU(2)_L$ in the same way.
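The detailed balance argument can be checked numerically. The sketch below assumes the exponential form $\Gamma_\pm = \Gamma_s e^{\mp\mu_B/2T}$, which is one solution of the two conditions stated above (any even function of $\mu_B$ could multiply it), and confirms that it reduces to the linear form at small $\mu_B/T$. The function name and the normalisation `gamma_s=1.0` are illustrative choices.

```python
import math


def transition_rates(mu_over_T, gamma_s=1.0):
    """Rates for B -> B+1 and B -> B-1, assuming an exponential form."""
    gamma_plus = gamma_s * math.exp(-0.5 * mu_over_T)
    gamma_minus = gamma_s * math.exp(+0.5 * mu_over_T)
    return gamma_plus, gamma_minus


def net_rate(mu_over_T, gamma_s=1.0):
    """Net rate of change of B: Gamma_+ minus Gamma_-."""
    gamma_plus, gamma_minus = transition_rates(mu_over_T, gamma_s)
    return gamma_plus - gamma_minus


if __name__ == "__main__":
    x = 1e-3
    gp, gm = transition_rates(x)
    print("Gamma_+/Gamma_- =", gp / gm, " exp(-mu/T) =", math.exp(-x))
    print("net rate =", net_rate(x), " linear estimate =", -x)
```

The net rate is negative for positive chemical potential, so the anomalous processes act to relax the excess, with a strength linear in $\mu_B/T$ near equilibrium.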

This discussion makes it clear that what one needs in order to drive 
baryon number violation is then an excess of left handed fermion number. 
As Cohen et al. pointed out, that may be produced by the interactions of 
fermions with the walls. 



8 Classical force baryogenesis 

I am going to describe the approach developed by Joyce, Prokopec and 
myself on baryogenesis from a classical force, which allows one to include 
all the different proposed effects including transport and one loop biasing. 
This work is reported in detail in three recent papers [36-38], and I shall 
only attempt to highlight some points here. Our motivations are twofold.
First, classical effects are more likely to survive at high temperature and
for thick bubble walls, where the WKB approximation is likely to be good. Second,
nonequilibrium effects are in principle straightforwardly calculable from the 
classical Boltzmann transport equation. 

Let us begin by defining the physical Higgs/gauge degrees of freedom
which shall be involved in the CP violation on bubble walls. If we diagonalise
the Higgs kinetic terms in the two-Higgs theory (taking
$\varphi_1 = \frac{1}{\sqrt{2}}(0, v_1e^{i\theta_1})$,
$\varphi_2 = \frac{1}{\sqrt{2}}(0, v_2e^{i\theta_2})$ as usual and ignoring
fluctuations in $v_1$, $v_2$, as well as the charged fields (see below)) we get









I7 _ 

2 ^ 



vld^ei+vld^92 



n 2 



( 8 . 1 ) 



where v"^ = v\ + v\ and = g\ + g\- is the usual (gauge variant) 
expression in terms of and B^. Both 9 and are gauge invariant, 
and we take them as our definition of the CP violating condensate field and 
the Z field respectively. 

Assume for simplicity that only $\varphi_1$ couples to the fermions via Yukawa
terms. The phase of the Higgs field $\theta_1$ can be removed by performing a $T^3$
(or hypercharge Y - the two are equivalent since Q = 2^^ + ^ is an unbro- 
ken exact symmetry) rotation on the fermions, at the cost of introducing 
a coupling $\partial_\mu\theta_1 T^3$, i.e. a pure gauge potential for $T^3$, into
the fermion kinetic terms. Let us see what effect this has. We may combine the
$\partial_\mu\theta_1 T^3$ potential with the coupling to the $Z$ field to find



$$\mathcal{L} = \bar\psi\, i\gamma^\mu\left(\partial_\mu + i g_A \tilde Z_\mu\gamma^5\right)\psi - m\bar\psi\psi \qquad (8.2)$$



where $g_A = +g/4$ for up-type quarks, and $g_A = -g/4$ for down-type quarks
and leptons, and

gAZ^ = gAZ^^ - (8.3) 

where gA = \g- We have dropped the vector coupling for the following 
reason. We treat the wall as planar, and assume it has reached a static 
configuration in the wall rest frame, with the background scalar fields being 
functions only of $z$, and the field $Z_\mu = (0, 0, 0, Z(z))$ being pure gauge. We
can then gauge away the vector term. (This puts some $\theta$ dependence into
Higgs-fermion interaction terms, whose effect we shall discuss below.) The 
axial term cannot be gauged away, and has a real physical effect. 

The WKB approximation to the dynamics of particles in the background 
of the bubble wall is good provided the length scale on which this back- 
ground varies is long in comparison to the de Broglie wavelength of the 
typical thermal particles we wish to describe. This is simply the requirement
that the thickness of the bubble walls $L$ be greater than $T^{-1}$. As we
have indicated above, this is a very reasonable expectation. 

To describe the WKB “particles” we turn to the Dirac equation de- 
rived from the above Lagrangian. The dispersion relation is obtained as 
follows. In the rest frame of the bubble wall we assume that the field
$Z_\mu = (0, 0, 0, Z(z))$, and we can boost to a frame in which the momentum
perpendicular to $z$ is zero. In this frame the Dirac equation reads (after
multiplying through by $\gamma^0$)

$$i\partial_0\psi = \gamma^0\left(-i\gamma^3\partial_z + m\right)\psi - g_A Z\,\Sigma^3\psi. \qquad (8.4)$$

Setting $\psi \sim e^{-iEt + ip_z z}$ we see that the energy is given by the usual
expression for a massive fermion plus a spin dependent correction. The
eigenspinors are just the usual free Dirac spinors.

Returning to the $p_\perp \neq 0$ frame amounts to replacing $E$ with
$\sqrt{E^2 - p_\perp^2}$, from which we find the general dispersion relation in
the wall frame,

$$E = \left[p_\perp^2 + \left(\sqrt{p_z^2 + m^2} - s\,g_A Z\right)^2\right]^{1/2}, \qquad s = \pm 1 \qquad (8.5)$$

where $s$ is proportional to the spin $S_z$ as measured in the frame where $p_\perp$
vanishes. The same dispersion relation holds for antiparticles. The particles
we are most interested in for baryogenesis are left handed particles (e.g. $t_L$)
and right handed antiparticles ($\bar t_L$), since these couple to the chiral anomaly.
Note that they couple oppositely to the $Z$ field.

In the WKB approximation we take each particle to be a wavepacket
labelled by canonical energy and momentum $(E = p_0, \mathbf{p})$ and position
$\mathbf{x}$. To compute the particle’s trajectory we first calculate the group
velocity

$$v_i = \dot x_i = \partial_{p_i}E \qquad (8.6)$$

and second, using conservation of energy
$\dot E = \dot x_i\partial_i E + \dot p_i\partial_{p_i}E = 0$, we find

$$\dot p_i = -\partial_i E. \qquad (8.7)$$



Together these constitute Hamilton’s equations. The momentum of the 
particle is not a gauge invariant quantity, but the particle worldline certainly 
is, and we can for example calculate the acceleration: 



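$$\frac{dv_z}{dt} = -\frac{1}{2}\frac{(m^2)'}{E^2} + s\,\frac{(g_A Z m^2)'}{E^2\sqrt{E^2 - p_\perp^2}} + O(Z^2) \qquad (8.8)$$

where $s = \pm 1$ labels the spin and of course $E$ and $p_\perp$ are constants of
motion. The first term describes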
the effect of the force due to the particle mass turning on, the second the 
chiral force. In the massless limit the latter vanishes, as it should because 
in this case the chiral gauge field can be gauged away. 
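The structure of the acceleration, a mass term plus a spin dependent chiral term, can be checked numerically from Hamilton's equations. The sketch below assumes a wall-frame dispersion relation of the form $E^2 = p_\perp^2 + (\sqrt{p_z^2 + m^2} - s\,g_A Z)^2$ with $s = \pm 1$, and compares the exact $v_z\,dv_z/dz$ along a trajectory with the expression expanded to first order in $Z$. The wall profiles `mass` and `g_a_z`, and all numerical values, are hypothetical and chosen only so that $g_A Z$ is small.

```python
import math


def mass(z, m0=1.0, wall=5.0):
    """Illustrative mass profile turning on across the wall."""
    return 0.5 * m0 * (1.0 + math.tanh(z / wall))


def g_a_z(z, amp=1e-3, wall=5.0):
    """Illustrative profile for g_A Z, localised on the wall."""
    return amp / math.cosh(z / wall) ** 2


def velocity(z, energy, p_perp, s):
    """Group velocity dE/dp_z for a particle moving in the +z direction."""
    eps = math.sqrt(energy**2 - p_perp**2)
    shifted = eps + s * g_a_z(z)          # equals sqrt(p_z^2 + m^2)
    p_z = math.sqrt(shifted**2 - mass(z) ** 2)
    return (eps / energy) * p_z / shifted


def derivative(f, z, h=1e-4):
    """Central finite difference."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


def acceleration_exact(z, energy, p_perp, s):
    """dv_z/dt = v_z dv_z/dz, using conservation of E and p_perp."""
    dv_dz = derivative(lambda x: velocity(x, energy, p_perp, s), z)
    return velocity(z, energy, p_perp, s) * dv_dz


def acceleration_first_order(z, energy, p_perp, s):
    """Mass term plus chiral term, to first order in g_A Z."""
    eps = math.sqrt(energy**2 - p_perp**2)
    d_m2 = derivative(lambda x: mass(x) ** 2, z)
    d_zm2 = derivative(lambda x: g_a_z(x) * mass(x) ** 2, z)
    return -0.5 * d_m2 / energy**2 + s * d_zm2 / (energy**2 * eps)


if __name__ == "__main__":
    for s in (+1, -1):
        print(s, acceleration_exact(1.0, 3.0, 1.0, s),
              acceleration_first_order(1.0, 3.0, 1.0, s))
```

The two spin states feel slightly different forces, and the difference disappears when the mass profile is set to zero, in line with the statement that the chiral force vanishes in the massless limit.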

We now seek to describe the particle excitations with dispersion relations
(8.5) as classical fluids. In our work on classical baryogenesis [36] we
have focussed on particles with large $|p_z| \sim T \gg m$ for three reasons:
they dominate phase space, the WKB approximation is valid, and the dispersion
relation simplifies so one can identify approximate chiral eigenstates.
The $S_z = +\frac{1}{2}$, $p_z < 0$ branch, and the $S_z = -\frac{1}{2}$,
$p_z > 0$ branch constitute one, approximately left handed fluid $L$, and the
other two branches an approximately right handed fluid $R$.

Fig. 2. The motion of classical particles in the background of a bubble wall with
CP violating condensate. Particles with positive spin $S_z$ perpendicular to the wall
see an effective barrier shifted to the right: particles with negative spin see the
opposite effect.

The Boltzmann equation is: 



$$d_t f \equiv \partial_t f + \dot z\,\partial_z f + \dot p_z\,\partial_{p_z} f = -C(f) \qquad (8.9)$$

where $\dot z$ and $\dot p_z$ are calculated as above, and $C(f)$ is the collision integral.

This can in principle be solved fully. However to make it analytically 
tractable we truncate it with a fluid approximation which we now discuss. 
When a collision rate is large, the collision integral forces the distribution 
functions towards the local equilibrium form 



$$f = \frac{1}{e^{\beta[\gamma(E - v p_z) - \mu]} + 1} \qquad (8.10)$$

where $T = \beta^{-1}$, $v$ and $\mu$ are functions of $z$ and $t$, and
$\gamma = 1/\sqrt{1 - v^2}$. These parametrise the fluid velocity $v$, number
density $n$ and energy density $\rho$. We treat the approximately left-handed
excitations $L$ and their antiparticles $\bar L$ as two fluids, making an
Ansatz of the form (8.10) for each. The fluid equations for particle minus
antiparticle perturbations ($\delta T = \delta T(L) - \delta T(\bar L)$,
$\mu = \mu(L) - \mu(\bar L)$, $\delta v = \delta v(L) - \delta v(\bar L)$) are,
in the rest frame of the wall



dT' ^ 1^ , A' ^ 
where the shifted chemical potential difference is /t = /r — 2vgAZz, (/I)
denotes the signed sum of chemical potentials for particles participating in 
the reaction, A = (/i) = (/{ — 2v^gAZz), is the difference between shifted 
$L$ and $\bar L$ potentials, and prime denotes $\partial_z$, $a = \pi^2/27\zeta_3$,
$b = n_0T_0/\rho_0$, $c = \ln 2/14\zeta_4$, $\zeta_4 = \pi^4/90$,
$n_0 = 3\zeta_3T_0^3/4\pi^2$, $\rho_0 = 21\zeta_4T_0^4/8\pi^2$ and $\zeta$ is the
Riemann $\zeta$-function. The derivation of these equations is simplest if one
shifts the canonical momentum to kz = Pz+gAZz and the chemical potential 
to /t. In this way the correct massless limit emerges as one expands in powers 
of Zz- r„ is simply related to the diffusion constant H - it is easily seen that 
D = (noTo/4apo)r-i « ip-b In fact we find Pt « |r„,r« « T/24 [38], 
and are derived from hypercharge conserving chirality flip pro- 
cesses, such as those involving external Higgs particles. In this case, the Zz 
contribution to the sum of chemical potentials vanishes. P^ and P* are the 
rates for hypercharge violating chirality flip processes, which are sup- 
pressed and for these the Zz contribution does not cancel. These were the 
terms driving “spontaneous” baryogenesis, in its modified form [39]. 
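The numerical constants entering the fluid equations are easy to evaluate. The sketch below assumes the readings $a = \pi^2/27\zeta_3$, $n_0 = 3\zeta_3T_0^3/4\pi^2$ and $\rho_0 = 21\zeta_4T_0^4/8\pi^2$, and computes the dimensionless combination $n_0T_0/(4a\rho_0)$ that relates the diffusion constant to the inverse damping rate. The value of $\zeta(3)$ is hard-coded because the Python standard library has no zeta function; the function names are illustrative.

```python
import math

ZETA3 = 1.2020569031595943      # Riemann zeta(3), hard-coded
ZETA4 = math.pi**4 / 90.0       # Riemann zeta(4)
A = math.pi**2 / (27.0 * ZETA3)


def n0(T0):
    """Number density of one massless fermionic degree of freedom."""
    return 3.0 * ZETA3 * T0**3 / (4.0 * math.pi**2)


def rho0(T0):
    """Energy density of one massless fermionic degree of freedom."""
    return 21.0 * ZETA4 * T0**4 / (8.0 * math.pi**2)


def diffusion_prefactor(T0=1.0):
    """Dimensionless combination n0*T0 / (4*a*rho0)."""
    return n0(T0) * T0 / (4.0 * A * rho0(T0))


if __name__ == "__main__":
    print("n0/T0^3   =", n0(1.0))
    print("rho0/T0^4 =", rho0(1.0))
    print("prefactor =", diffusion_prefactor())
```

Under these assumptions the prefactor is independent of $T_0$ and close to one quarter, so the diffusion constant is roughly a quarter of the inverse velocity damping rate.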

We showed in [38] that over a wide range of possible parameter space a 
simple approximate solution holds to (8.13). In this solution, Sv and 6T are 
negligible and the chemical potential /t is simply related to the force term 
on the wall 



, 2 In 2 gAZzOrr 

p = — 



3C3 



T2 



on the wall 



(8.14) 



pL in front of the wall is then determined from a conservation law as follows. 
Integrating (8.12) and (8.13) (with all P^’s zero), gives f^ST « 0 and 
f^^Sv = 0. Then integrating (8.11) twice we find P = 0 i.e. no net 
integrated chemical potential perturbation is generated. This means that 
the chemical potential generated on the wall is compensated by an opposite 
chemical potential off it. As mentioned above, off the wall the equations for 
pi reduce to the diffusion equation, and it is straightforward to see that in 
the absence of particle number violation the only nontrivial solution for pi 
is a diffusion tail in front of the wall. This is where the chiral charge deficit 
occurs, which drives baryogenesis. 

Hence the integral of the chemical potential in front of the wall {z > 0) 
equals: 

/ dzfi=^—v^ dzgAZzm^. (8.15) 

^0 Jwall 

Now, using the formula for baryon number violation (7.1) 



3 r 

fiB = -v^Uq = - Pjj 



(8.16) 



where $\Gamma_s = \kappa(\alpha_W T)^4$ is the weak sphaleron rate in the unbroken
phase, $N_c$ is the number of colors, $\kappa \in [0.1, 1]$ [7]. We have
re-expressed $\mu$ in terms of top quark and antiquark chemical potentials. We
arrive at a formula for the baryon to entropy ratio:

riB 135 In 2 f m'^gAZ^ f m'^gAZ^ 

= j 1 

where $s = \frac{2\pi^2}{45}g_*T^3$ is the entropy density, with $g_* \approx 100$
the effective number of degrees of freedom.

This result is remarkably simple - all dependence on the wall veloc- 
ity, thickness and the diffusion constant drops out! It is also quite large: 
(mt/T)^ ~ 1, so ne/s ~ 4 X 1O“®k0cp where 0cp characterises the strength 
of the CP violation. In [38] we give a more detailed derivation of (8.17) with 
a full discussion of parameter dependences, including the effect of the P^ 
terms. 

The calculation of the classical force effect above uses the opposite 
(WKB) approximation to those employed in quantum mechanical reflection 
calculations (thin walls) [34, 40] . The classical force calculation is in some 
respects “cleaner” , because the production of chiral charge and its diffusion 
are treated together. The classical force affects particles from all parts of 
the spectrum, mostly with typical energies $E \sim T$, and with no preferential
direction, while the quantum mechanical effect comes mainly from particles
with a very definite ingoing momentum perpendicular to the wall: $p_z \sim m_H$
(H iggs mass). The quantum result falls off strongly with L (at least as T“^) 
as the WKB approximation becomes good. The quantum result also has a 
dependence coming from the diffusion time in the medium, which the 
classical result loses because the force term is proportional to $v_w$.

It is interesting to compare this “classical force” effect with the local
baryogenesis term mentioned above, equation (6.3). The two are parametrically
identical, but the classical baryogenesis effect is larger by two factors
of $\pi^2$. The first comes about because the local term is a one loop effect,
and the second because of the Matsubara frequency being $\pi T$, an effect as
mentioned earlier due to the Pauli exclusion principle. So it seems quite
clear-cut that the classical baryogenesis effect is larger. I should also
mention the “nonlocal spontaneous” term, discussed in detail in [38], which is of
the same order as the classical force effect, although with a quite different
and more complicated parametric dependence.

To summarise, the minimal standard model looks to be unlikely as a 
consistent theory of baryogenesis, both because there is no phase transition 
and because CP violating effects are so small. But it is also clear that there 
are viable extensions of the standard model (such as two-Higgs theory) in 
which an acceptable baryon asymmetry can be generated. For a recent 
review of the viability of the MSSM, see [41]. But we should not forget that 
in the end verification of these ideas will undoubtedly involve experiment. 



(8.17) 

9 Cosmic defects

Now we turn to higher energy phenomena in cosmology, and the 
consequences of Grand Unification physics. As you are aware, there is a 
huge amount of freedom as well as uncertainty in constructing unified the- 
ories of the forces of nature - we simply have much too little data and 
despite recent progress in string theory and M theory, no compelling theo- 
retical model. However there are some generic consequences of Unification 
of the forces, one being the existence of cosmic defects. They are an almost 
inevitable consequence of unification and symmetry breaking. It is there- 
fore of great interest to understand their cosmological consequences, many 
of which are relatively independent of their microphysical origin [42,43]. 

A lot of interest in cosmic defects attached to their potential to generate
structure in the universe. Many of the simplest grand unified theories
predict cosmic strings [44,46], and simple family symmetry theories predict
cosmic texture [47]. The amplitude of the primordial density perturbations
$\delta\rho/\rho$ emerges naturally in these theories as the ratio
$(M_{GUT}/M_{Pl})^2$, where $M_{GUT}$ is the scale at which the observed gauge
couplings unify, of order $10^{-3}M_{Pl}$.



10 Unification and symmetry breaking 

Figure 3 provides one large piece of evidence for unification of the forces. 
If we extrapolate the experimentally measured strong, weak and electro- 
magnetic couplings to higher energies then they come remarkably close to 
unifying at a scale $M_{GUT}$ just above $10^{16}$ GeV. This is true with or without
supersymmetry, but if we include supersymmetry at scales above a TeV or 
so, they come closer to meeting. The second major piece of evidence is in 
the special pattern of representations of $SU(3) \times SU(2) \times U(1)$ that the
fermions lie in. These representations fit neatly into minimal representations
of $SU(5)$ and even better into a single representation of $SO(10)$ or
the exceptional groups. These two pieces of evidence point strongly towards
unification of the forces at a scale of around $10^{16}$ GeV. There has been much
recent interest in the idea of large extra dimensions, with unification oc- 
curring at much lower energies. I think this is a very interesting proposal 
and well worth exploring, but the elegance and simplicity of the purely four 
dimensional account of unification is unfortunately lost. 

If one is bold (or naive, depending on your point of view) and extrapolates
the hot Big Bang right back to temperatures of $M_{GUT}$, at high
temperatures one might expect the Grand Unified symmetry to be restored. As
the universe cooled, the symmetry would be spontaneously broken. In this
situation, quite generically, cosmic defects would form.

Fig. 3. Unification of couplings.



The existence of cosmic defects is determined by the structure of the
set of minima of the Higgs potential, the classical vacuum manifold. As a
familiar example consider the standard model and neglect the complication
of hypercharge. The group $SU(2)$ is broken by a Higgs doublet $\phi$. The
vacuum manifold $V_0$ is the set of minima of the potential, in this case the
set of $\phi$ obeying $\phi^\dagger\phi = v^2$ with $v = 250$ GeV. This manifold is
easily seen to be a three sphere. In general, if one assumes no accidental
degeneracy, one expects the vacuum manifold to be the set
$\{g\phi_0 : g \in G\}$ where $\phi_0$ is a particular choice for $\phi$ which
minimises the potential. In general, $\phi_0$ has a little group $H$ being the
elements of $G$ which leave $\phi_0$ unchanged:
$H = \{g \in G : g\phi_0 = \phi_0\}$. It is then easy to see that two elements of
$G$ will produce the same point on the vacuum manifold if and only if they are
related by right multiplication by an element of $H$. Thus in the absence of
accidental extra symmetry, the vacuum manifold is topologically equivalent
to the coset space $G/H$, where $H$ is the unbroken subgroup of $G$ [42]. In
simple cases, $V_0$ actually has the same metric as that on $G/H$. In general
however there will be a different metric on the space of fields, induced by
the kinetic term in the Lagrangian.
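The $SU(2)$ example can be made concrete with a few lines of code. The sketch below parametrises $SU(2)$ matrices by a pair of complex numbers $(a, b)$ with $|a|^2 + |b|^2 = 1$, acts on the reference vacuum $\phi_0 = (0, v)$, and checks that every image has the same norm, so the orbit lies on a three sphere. The helper names, the random sampling and the normalisation $|\phi_0| = v$ are illustrative assumptions.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix [[a, -conj(b)], [b, conj(a)]], |a|^2 + |b|^2 = 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(c * c for c in x))
    a = complex(x[0], x[1]) / n
    b = complex(x[2], x[3]) / n
    return ((a, -b.conjugate()), (b, a.conjugate()))


def act(g, phi):
    """Matrix-vector product g * phi for a 2x2 matrix and a doublet."""
    return (g[0][0] * phi[0] + g[0][1] * phi[1],
            g[1][0] * phi[0] + g[1][1] * phi[1])


def norm(phi):
    """Euclidean norm of a complex doublet."""
    return math.sqrt(abs(phi[0]) ** 2 + abs(phi[1]) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    v = 250.0
    phi0 = (0.0 + 0.0j, v + 0.0j)
    for _ in range(3):
        print(norm(act(random_su2(rng), phi0)))
```

Since $g\phi_0 = (-\bar b v, \bar a v)$, the image determines $(a, b)$ uniquely, so the little group of $\phi_0$ is trivial and the map from $SU(2)$ to the vacuum manifold is one-to-one, consistent with $G/H$ being a three sphere in this case.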

11 Homotopy and topology 

In the case of grand unification, as in Figure 3, we are interested in
determining whether cosmic defects exist. This depends on the topology
of $V_0 \simeq G/H$, and more particularly on its homotopy groups $\pi_n$. If
$G/H$ is disconnected ($\pi_0$ nontrivial) there is a discrete set of distinct
vacua and one obtains domain wall solutions. If $G/H$ is not simply connected
($\pi_1$ nontrivial) one obtains cosmic strings because as one follows a path
around a closed loop in the vacuum, the fields may traverse $V_0$ along a
noncontractible loop. If $G/H$ has a nontrivial $\pi_2$ then one obtains
pointlike monopole defects where the fields wrap around a noncontractible
two-sphere as one encloses a region with a closed two dimensional surface.
Finally, if $G/H$ has a nontrivial $\pi_3$ then there are still topological
defects known as texture. These
possibilities are illustrated in Figure 4. 
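The classification just described amounts to a lookup from the index $n$ of a nontrivial homotopy group $\pi_n(G/H)$ to a defect type. A minimal sketch, with hypothetical names, is:

```python
# Defect type associated with each nontrivial homotopy group pi_n(G/H).
DEFECT_BY_HOMOTOPY = {
    0: "domain walls",
    1: "cosmic strings",
    2: "monopoles",
    3: "textures",
}


def defects(nontrivial_groups):
    """Return defect types for a collection of n with pi_n(G/H) nontrivial."""
    return [DEFECT_BY_HOMOTOPY[n]
            for n in sorted(nontrivial_groups)
            if n in DEFECT_BY_HOMOTOPY]


if __name__ == "__main__":
    print(defects({2}))
    print(defects({1, 3}))
```

The lookup only records which defects are topologically allowed; whether they are cosmologically relevant depends on the dynamics discussed in the following sections.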

12 Existence of defects 

There are some general theorems which help one establish whether there 
are cosmic defects in a given Grand Unified theory. These are illustrated 
in Figure 5. 

First, if a simple group $G$ is broken to a group containing a $U(1)$ factor,
then $\pi_2(G/H)$ is nontrivial and magnetic monopoles form. This is the case
of most interest in Grand Unification, since we are interested in obtaining
$H \simeq SU(3) \times U(1)$, the unbroken symmetry group of the standard
model. One concludes that magnetic monopoles are a mandatory consequence
of grand unification, so that the formation of magnetic monopoles
as the universe cooled through the GUT temperature is certainly to be
expected on the basis of a naive extrapolation of the Big Bang back to very
early times. Second, if $G$ is broken to a group containing a discrete factor,
then $\pi_1(G/H)$ is nontrivial and cosmic strings form. Whilst their
formation is not mandatory, they are nevertheless ubiquitous in realistic GUT
models. The simplest $SU(5)$ theory does not predict them but the $SO(10)$
theory, with the Higgs needed to give the right handed neutrino a large
mass, does. A general analysis of this kind of example was given in [46].
Finally, if a simple group $G$ is broken completely so that $H = 1$, then
$\pi_3(G/H) = \pi_3(G)$, and this is nontrivial for any simple $G$. In this
situation one produces texture. This happens both in the electroweak theory,
where the texture is gauged, and in chiral symmetry breaking where the texture
defects are known as Skyrmions. Neither case is interesting for cosmic
structure formation. Gauged texture can relax to a local $N$-vacuum, as
explained above. And chiral symmetry is only an approximate symmetry. For
texture to be interesting for structure formation, the broken symmetry has to
be global and very nearly exact. One example of where this might occur is in
theories of continuous family symmetry, where the groups $SU(2)$ or $SU(3)$
are relevant. The latter case is interesting since $G$ has to be global rather
than gauged (if gauged it would be anomalous) and the texture produced
is of the type relevant for structure formation [47, 48].

Fig. 4. Cosmic defects.

Fig. 5. Pictorial explanation of homotopy relations discussed in the text. Assume
$G$ is connected and simply connected i.e. has no noncontractible loops. If $G$ is
broken to a subgroup $H$, the vacuum manifold is $G/H$ which is obtained from $G$ by
identifying elements related by right multiplication under the unbroken subgroup
$H$. $G/H$ can be thought of as the manifold obtained from $G$ by shrinking $H$ to a
point. If $H$ is not connected as shown on the left, with components $H$ and $H'$, then
shrinking $H$ produces a “tube” in $G/H$ which therefore has a noncontractible loop.
If $H$ is not simply connected as shown on the right then $G/H$ has a noncontractible
two sphere. For a real proof see for example [45].



13 Low-energy actions 

Once the symmetry is broken, cosmic defects survive as remnants of the 
original unbroken phase surrounded by regions of broken phase vacuum. It 
is plausible that dynamics of the defects is governed by an effective action 
for low energy degrees of freedom. For example, to bend a very long string 
by a small amount costs little energy, so such bending modes should be 
allowed in the low energy action. It is not hard to see that the action for 
the simplest type of cosmic defects with no internal structure should be 
just the geometrical Nambu-Goto action, the area of the world sheet or the 
world volume swept out by the defect [52]. In order to be dimensionless the
action must also involve a constant, which for a string is just the energy per
unit length $\mu \sim M_{GUT}^2$:

$$S = -\mu\int d^2\sigma\,\sqrt{-\det\gamma_{AB}}, \qquad \gamma_{AB} = \partial_A x^\mu(\sigma)\,\partial_B x^\nu(\sigma)\,g_{\mu\nu}(x^\alpha(\sigma)) \qquad (13.1)$$

where the spacetime coordinates of the string worldsheet are $x^\mu(\sigma)$,
$\mu = 0, 1, 2, 3$ and the world sheet coordinates are $\sigma^A$, $A = 0,1$. The two
by two matrix $\gamma_{AB}$ is the induced metric on the string worldsheet.

For the case of broken continuous symmetries, the relevant degrees of
freedom at low energies are the Goldstone modes: these live on the manifold
$V_0$ and the kinetic term for the scalar fields provides a metric $\gamma_{IJ}$ on
that manifold. The action may be written in a coordinate invariant manner:

$$S = M_{GUT}^2\int d^4x\,\sqrt{-g}\;g^{\mu\nu}\,\partial_\mu\phi^I\,\partial_\nu\phi^J\,\gamma_{IJ}(\phi) \qquad (13.2)$$

where $g_{\mu\nu}$ is the background spacetime metric and the $\phi^I$ are a suitable
set of coordinates for $V_0$, and the metric on $V_0$ is $\gamma_{IJ}$. In the simplest
examples, $V_0$ is a sphere and we may use Cartesian coordinates $\phi^a$,
$a = 1...N + 1$, provided we impose the constraint $\phi^2 = \phi_0^2$. This is
conveniently done by introducing a Lagrange multiplier field.

An attractive feature of these low energy effective actions is that the 
equations of motion for the defects do not involve parameters beyond those 
needed to describe the vacuum manifold. In the simplest cases, the vacuum 
manifold is maximally symmetric and there are no free parameters in the 
dynamics at all. The simplest expectation is that if the defects are produced 
in a quench, with no long range correlations, then after a time $t$ the 
correlation length will be proportional to the speed of light times $t$. This 
is the assumption of “scaling”. If it is correct, then since the stress energy 
tensor involves an overall factor of $\mu \sim M_{GUT}^2$ for strings and $\sigma \sim M_{GUT}^3$ 
for walls, one expects the energy density of the defects to be given by 



$$\rho_{\rm strings,\,textures} \sim \frac{M_{GUT}^2}{t^2}, \qquad \rho_{\rm walls} \sim \frac{M_{GUT}^3}{t} \qquad (13.3)$$



The argument that cosmic defects could be interesting for structure formation 
runs as follows. Since the energy density in the background universe 
is approximately given by $\rho_B \sim 1/(30 G t^2)$, strings or textures form a fixed 
fraction of the total energy density, and as the scale of the defect network 
grows the network imprints a perturbation on the background universe of 
order $\rho_{\rm defects}/\rho_B \sim 30 G M_{GUT}^2$. One would like this perturbation to be of 
the order of the amplitude of perturbations detected by COBE, or $10^{-5}$, and this 
is so provided $M_{GUT}$ is of order $10^{16}$ GeV, which is close to what is expected 
from particle physics (Fig. 3). This is a nice coincidence. 
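As a rough numerical check of this estimate, the following Python sketch evaluates $30 G M_{GUT}^2$ in natural units, where $G = 1/M_{\rm Planck}^2$. The Planck mass value and the function name `defect_perturbation` are assumptions introduced here for illustration; only the order of magnitude is meaningful.

```python
# Order-of-magnitude size of the perturbation imprinted by a scaling
# defect network, rho_defects / rho_B ~ 30 G M_GUT^2.
# Natural units (hbar = c = 1), so Newton's constant is G = 1 / M_Planck^2.

M_PLANCK_GEV = 1.22e19  # Planck mass in GeV (assumed standard value)


def defect_perturbation(m_gut_gev: float) -> float:
    """Return the dimensionless estimate 30 * G * M_GUT^2."""
    newton_g = 1.0 / M_PLANCK_GEV**2
    return 30.0 * newton_g * m_gut_gev**2


if __name__ == "__main__":
    for m_gut in (1e15, 1e16, 1e17):
        amplitude = defect_perturbation(m_gut)
        print(f"M_GUT = {m_gut:.0e} GeV -> perturbation ~ {amplitude:.1e}")
```

Because the estimate scales as $M_{GUT}^2$, a change of one decade in the symmetry breaking scale moves the perturbation amplitude by two decades.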



14 Scaling 

The naive expectation of scaling is confirmed by simulations of textures, and 
by an exact solution of the $O(N)$ texture model in the large $N$ limit [53]. But 
for strings the situation is considerably more complicated, in that strings 
preserve some degree of small scale structure even at late times. Figure XX 
shows a box of strings evolved using an exact algorithm in flat (Minkowski) 
spacetime. It shows only the long strings - there is a “fine 
dust” of small loops present which have been chopped off the network, and 
which would ultimately decay. The long strings remain quite jagged as the 
simulation proceeds, even though the network appears to exhibit scaling. 

In the standard picture the assumption was that loops or kinks on the 
long strings would eventually decay via the emission of gravity waves. This 
has recently been questioned by Hindmarsh and collaborators, who returned to 
the full classical field theory to perform very large scale parallel simulations. 
They found the Nambu approximation to be faulty even at late times, because 
the string continually regenerates small scale structure on the order 
of its width at all times. Even though the Nambu action breaks down, the 
simulations are nevertheless consistent with scaling. Hindmarsh et al. conjecture 
that for gauged strings the dominant energy loss mechanism for the 
string network would be the emission of high energy particles produced from the small 
scale structure on the strings. If this is correct, then the observational limits 
on high energy (1 - 10 GeV) cosmic rays are enough to rule out GUT 
scale strings as the source for structure formation in the universe [54]. It 
is not yet clear that this argument is reliable, since it relies upon an enormous 
extrapolation of the numerical simulations for which there is little analytic 
justification. Global strings might be less susceptible to this problem, 
since they radiate predominantly into Goldstone bosons which couple only 
very weakly to ordinary matter. 

15 π in the sky 

The best way to detect macroscopic cosmic defects would be to see them in 
the pattern of fluctuations on the cosmic microwave sky. The most famous 
effect is the Kaiser-Stebbins effect whereby a moving cosmic string produces 
a linear discontinuity on the sky [59]. This effect is produced by the gravitational 
field of the moving string, which tends to “slingshot” photons on the 
trailing side, causing a blueshift there proportional to $G M_{GUT}^2 v$, where 
$v$ is the string’s speed perpendicular to the line of sight. Unfortunately 
it is difficult to observe these discontinuities except at very high resolution 
because they are masked by a larger background of fluctuations due to the 
acoustic oscillations in the photon-baryon fluid sourced by the strings at 
early times. 

Nevertheless the pattern of microwave anisotropies produced by the cos- 
mic defects is highly distinctive, and very different from that produced by 
inflation. The key differences are that the defect-produced fluctuations are: 



• Causal in the standard hot Big Bang sense. Namely, the density 
perturbations at two spacetime points whose backward light cones 
do not overlap are strictly uncorrelated; 

• Nonlinear, with a complete absence of long wavelength perturbations 
in the initial conditions. This is most conveniently described using a 
pseudo-energy conservation law [63,64]. The long wavelength modes 
are generated via the transfer of power from short wavelengths through 
nonlinearity; 

Fig. 6. A scaling string network in flat spacetime. 

• Related to the nonlinearity, the perturbations are non-Gaussian [60]. 
For a Gaussian random field, all statistical correlators are determined 
from the two point function. For a non-Gaussian random field, some 
or all of the irreducible higher correlations are nonzero. (If just one is 
nonzero, one can show that there must be an infinite number which 
are nonzero.) The pictures of the microwave anisotropies produced 
by defects, shown in Figure 2, are non-Gaussian, as is clear from the 
existence of distinctive hot spots; 

• Also related to the nonlinearity, the perturbations are incoherent [65]. 
In a theory where there is no driving force at late times, the late 
time evolution of every Fourier mode of wavenumber $k$ is identical. 
Any decaying mode components vanish rapidly and at late times they 
all follow a single growing mode evolution. However in defect theories 
the perturbations are produced at late times by a driving force. In 
this case, every wave vector $\mathbf{k}$ has an independent history for the perturbations. 
This means that when one computes the power spectrum, 
which involves averaging over all $\mathbf{k}$, the coherent oscillations seen in 
the standard theories are smeared out, as shown in Figure 9 below. 

Fig. 7. Causality in the hot Big Bang. Any perturbations that are generated 
after a phase transition at conformal time $\tau_{PT}$ must be strictly uncorrelated for 
two spacetime points whose backward light cones do not overlap for $\tau > \tau_{PT}$. 
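The loss of coherence described in the last item can be illustrated with a deliberately crude toy model, sketched below in Python. It is not the calculation used for the figures: each mode is simply taken to be $A\cos(k\tau + \varphi)$ with a Gaussian amplitude $A$, and "coherent" means $\varphi = 0$ for every realisation while "incoherent" means $\varphi$ is drawn at random. All function names and parameter values are assumptions made for this illustration.

```python
import math
import random


def power_spectrum(ks, tau, n_real, coherent, seed=0):
    """Ensemble-averaged power <delta_k^2> at time tau for a toy mode model.

    Each realisation of a mode is A * cos(k * tau + phase), with A drawn from
    a unit Gaussian. Coherent modes all share phase = 0; incoherent modes get
    an independent random phase, mimicking an independent driving history.
    """
    rng = random.Random(seed)
    spectrum = []
    for k in ks:
        total = 0.0
        for _ in range(n_real):
            amp = rng.gauss(0.0, 1.0)
            phase = 0.0 if coherent else rng.uniform(0.0, 2.0 * math.pi)
            total += (amp * math.cos(k * tau + phase)) ** 2
        spectrum.append(total / n_real)
    return spectrum


def contrast(spectrum):
    """(max - min) / mean: large for an oscillating spectrum, small for a smooth one."""
    mean = sum(spectrum) / len(spectrum)
    return (max(spectrum) - min(spectrum)) / mean


if __name__ == "__main__":
    ks = [0.1 * i for i in range(1, 101)]
    for label, flag in (("coherent", True), ("incoherent", False)):
        spec = power_spectrum(ks, tau=1.0, n_real=1000, coherent=flag)
        print(f"{label:>10}: oscillation contrast = {contrast(spec):.2f}")
```

In the coherent case the averaged power follows $\cos^2(k\tau)$ and oscillates strongly with $k$; with random phases the average tends to a constant, which is the smearing referred to in the text.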

These differences have inspired a lot of recent work, leading to a better ap- 
preciation of what is special about the coherent perturbations produced by 
simple inflationary models. In consequence, even though the simplest defect 
models seem clearly ruled out, they have served a very useful educational 
role. 

16 Precision calculations 

After many years of struggling with the nonlinearity of defect theories, 
a recent breakthrough in our ability to accurately calculate their predictions 
was made with the development of techniques which properly 
incorporated all of the above effects [61,62]. These techniques may well 
prove useful in other fields. 

After Fourier transforming with respect to the comoving spatial coordi- 
nate, the linearised Einstein equations take the form 

$$\mathcal{L}_k\, \delta(\mathbf{k},\tau) = \Theta(\mathbf{k},\tau) \qquad (16.1)$$

where $\mathcal{L}_k$ is an ordinary differential operator in $\tau$, the conformal time. 
$\delta(\mathbf{k},\tau)$ is a column vector composed of the metric and fluid perturbations 
for each wavevector $\mathbf{k}$, and $\Theta(\mathbf{k},\tau)$ is the stress energy tensor of the defect 
source which acts as a driving term in the Einstein-matter equations. A 
number of matrix indices are suppressed in this equation. The solution to 
(16.1) is expressible via a Green's function: 

$$\delta(\mathbf{k},\tau) = \int_0^\tau d\tau'\, G_k(\tau,\tau')\, \Theta(\mathbf{k},\tau') \qquad (16.2)$$

where $\tau$ is an appropriate time coordinate. From this one can show that 
the general two point correlation function of any perturbation variables at 
any two spacetime points is expressible in terms of the unequal time correlation 
functions of the defect stress energy: 

$$\langle \Theta_{\mu\nu}(\mathbf{k},\tau)\, \Theta_{\rho\lambda}(-\mathbf{k},\tau') \rangle = C_{\mu\nu,\rho\lambda}(k,\tau,\tau'). \qquad (16.3)$$

Note that by rotational invariance, this, like $G_k(\tau,\tau')$, depends only on 
the magnitude of $\mathbf{k}$ and not on its direction. 

Fig. 8. Subdegree maps of the microwave anisotropies from cosmic defects. The 
maps show only the intrinsic and Doppler anisotropies, which dominate on small 
angular scales. The size of the maps is ten degrees on a side, for a flat universe. 
The most striking feature is the presence of hot spots caused by the defects 
attracting concentrations in the photon-baryon fluid about them [60]. 
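To make the Green's function solution (16.2) concrete, here is a minimal Python sketch for a toy problem, not the actual Einstein-matter system. It assumes the operator is $d^2/d\tau^2 + k^2$, whose retarded Green's function is $\sin(k(\tau-\tau'))/k$, and a made-up source $e^{-\tau'}$ standing in for the defect stress energy. The integral is compared with the closed-form solution of the same driven oscillator with vanishing initial data.

```python
import math


def green(k, tau, tau_p):
    """Retarded Green's function of the toy operator d^2/dtau^2 + k^2."""
    return math.sin(k * (tau - tau_p)) / k if tau >= tau_p else 0.0


def source(tau_p):
    """Hypothetical decaying source standing in for the defect stress energy."""
    return math.exp(-tau_p)


def response(k, tau, n_steps=2000):
    """delta(k, tau) = int_0^tau G_k(tau, tau') Theta(tau') dtau' (trapezoidal rule)."""
    h = tau / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        tau_p = i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * green(k, tau, tau_p) * source(tau_p)
    return total * h


def exact(k, tau):
    """Closed-form solution of delta'' + k^2 delta = exp(-tau), delta(0) = delta'(0) = 0."""
    return (math.exp(-tau) - math.cos(k * tau) + math.sin(k * tau) / k) / (1.0 + k * k)


if __name__ == "__main__":
    for k in (0.5, 2.0, 5.0):
        print(k, response(k, 3.0), exact(k, 3.0))
```

The point of the exercise is that once $G_k$ is known, the response to any source history follows by quadrature, which is why correlators of the perturbations reduce to unequal time correlators of the source.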

The unequal time correlators (UETCs) are constrained by causality, scaling 
and stress energy conservation. Causality means that the real space correlators 
of the fluctuating part of $\Theta_{\mu\nu}$ (which are just the Fourier transform 
of (16.3)) must be zero for $r > \tau + \tau'$ [61] (Fig. 7). Scaling [62] dictates that 
in the pure matter or radiation eras 

$$C_{\mu\nu,\rho\lambda}(k,\tau,\tau') \propto \frac{\phi_0^4}{(\tau\tau')^{1/2}}\, c_{\mu\nu,\rho\lambda}(k\tau, k\tau'), \qquad (16.4)$$

where $\phi_0$ is the symmetry breaking scale and $c$ is a scaling function. Finally, 
energy and momentum conservation (see e.g. [62]) provide two linear 
differential equations which the four scalar components must satisfy. 

The idea of our method is to measure the UETCs in large simulations, 
once scaling has set in, and then re-express them as a sum over eigenvector 
products: 

$$C(k,\tau,\tau') = \sum_i \lambda^i\, v^i(k,\tau)\, v^i(k,\tau'), \qquad (16.5)$$

where 

$$\int d\tau'\, C(k,\tau,\tau')\, v^i(k,\tau')\, w(\tau') = \lambda^i v^i(k,\tau). \qquad (16.6)$$

The indices labelling the components of the stress tensor are implicit. We 
have found that the first 15 eigenvectors typically reproduce the UETCs to 
better than ten per cent: the effect of including more than 15 on the final 
perturbation power spectra is negligible at the few per cent level. 
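The eigenvector expansion (16.5)-(16.6) can be tried out on a small discretised example. The Python sketch below is only an illustration under stated assumptions: the "correlator" is a made-up Gaussian kernel on a time grid rather than a measured UETC, the weight $w$ is set to one, and the eigenpairs are found by plain power iteration with deflation. All function names are hypothetical.

```python
import math
import random


def toy_correlator(n, length=3.0):
    """Hypothetical symmetric 'UETC' on a time grid: C_ij = exp(-(t_i - t_j)^2 / 2)."""
    t = [length * i / (n - 1) for i in range(n)]
    return [[math.exp(-0.5 * (ti - tj) ** 2) for tj in t] for ti in t]


def matvec(a, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in a]


def leading_eigenpair(a, iters=300, seed=1):
    """Power iteration for the largest eigenvalue of a symmetric matrix."""
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.5) for _ in a]
    for _ in range(iters):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(x * y for x, y in zip(v, matvec(a, v)))
    return lam, v


def eigen_expansion(c, n_modes):
    """Return [(lambda_i, v_i)] for the n_modes largest eigenvalues."""
    a = [row[:] for row in c]
    pairs = []
    for _ in range(n_modes):
        lam, v = leading_eigenpair(a)
        pairs.append((lam, v))
        for i in range(len(a)):  # deflate: A <- A - lambda v v^T
            for j in range(len(a)):
                a[i][j] -= lam * v[i] * v[j]
    return pairs


def truncation_error(c, pairs):
    """Relative Frobenius-norm error of sum_i lambda_i v_i v_i^T versus C."""
    num = den = 0.0
    for i in range(len(c)):
        for j in range(len(c)):
            approx = sum(lam * v[i] * v[j] for lam, v in pairs)
            num += (c[i][j] - approx) ** 2
            den += c[i][j] ** 2
    return math.sqrt(num / den)


if __name__ == "__main__":
    c = toy_correlator(20)
    pairs = eigen_expansion(c, 4)
    for m in range(1, 5):
        print(m, "modes -> relative error", truncation_error(c, pairs[:m]))
```

For a smooth kernel such as this one the eigenvalues fall off quickly, so a handful of terms already reproduces the correlator well; how many terms a real UETC needs is an empirical matter, as the text indicates.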

There are many virtues of this representation, described in [62]. Cru- 
cially, it avoids the so called “compensation problem” which had plagued 
earlier treatments of defect induced perturbations. Once one has the source 
eigenvectors one can feed them into a Boltzmann solver and compute the mi- 
crowave anisotropies and large scale matter power spectrum in a completely 
self-consistent manner (Fig. 9). Whereas previous methods had large sys- 
tematic effects which limited them to accuracies of at best a factor of two, 
the current method gives results reliable to 10 per cent. An independent 
replication of the method by the Geneva group has successfully reproduced 
our results [55]. 

17 Refutation 

The outcome of these calculations is that the simplest theories fail badly 
with respect to the observations. There are two problems, illustrated in 
Figures 10 and 11 respectively. The first is that the defect theories do not 
produce the “Doppler peak” , now seen clearly in several cosmic microwave 
background experiments, showing a strong rise at degree scales. The reasons 
for this are twofold. First, the incoherence of the perturbations mentioned 
above. Second, the inevitable presence of vector and tensor perturbations. 
This is actually required by causality [56]. For textures, there actually is 
a pronounced Doppler peak in the scalar $C_l$ but this disappears when one 
includes the tensor and vector contributions. 

Fig. 9. The cosmic microwave anisotropies produced by global strings in the 
eigenvector expansion. Each of the lower curves represents the contribution of 
a single eigenvector in the sum (16.6). The upper curve is the sum. As can be 
seen, whilst each eigenvector produces a set of Doppler peaks, the final sum is 
remarkably smooth. This curve shows the scalar power spectrum; those in Fig. 10 
show the entire spectrum. Convergence of the sum is quite rapid and the final $C_l$ 
spectrum is relatively featureless. 

Fig. 10. Predictions of the global defect theories (curves) for the cosmic microwave 
anisotropy are compared to the observations (error bars). 

The second is that if the theories are normalised to COBE, they fail to 
produce adequate density perturbations on galaxy clustering scales 
(10 - 100 Mpc). The conventionally defined measure of perturbations on 
the scale of galaxy clustering, $\sigma_8$, is too low, and the problem gets worse at 
larger scales. As noted above, the theories also show no significant Doppler 
peak at $l \sim 200 - 300$, as seems to be indicated in the cosmic microwave 
anisotropy data. In the global defect theories, Pen, Seljak and I found 
that these problems are not alleviated by the usual fixes for inflationary 
models - such as introducing a cosmological constant, making the universe 
open, or adding a hot dark matter component. Avelino et al. [57] and 
Battye et al. [66] claim on the basis of more approximate calculations that 
the situation is improved for local cosmic strings in Lambda or open universes. 
They correctly point out that lowering $\Omega_{\rm matter}$ causes the power 
spectrum to shift to longer wavelengths by a factor [...] and this improves 
the shape of the power spectrum relative to the observations. However 
there is still a problem with the amplitude of the rms mass fluctuations, 
$\delta M/M$. Shifting the matter power spectrum increases $\delta M/M$ at most by 
[...] even if we make the most optimistic assumption, that the power 
spectrum $P(k) \propto k$. This is offset by the loss in growth proportional to 
[...] for a flat (Lambda) universe, [...] for an open one. One of 
the observations we need to fit is the rms mass fluctuation on $8 h^{-1}$ Mpc, 
estimated from the abundance of galaxy clusters to be $\sigma_8 \sim 0.57\, \Omega_{\rm matter}^{-0.56}$. The 
number comes from simulations of Gaussian models, but at least with cold 
dark matter, cosmic strings are likely to be quite Gaussian on these relatively 
small scales (since the perturbations come from many generations of 
horizon scale strings each producing roughly equal amplitude fluctuations 
in the radiation era). Comparing the two dependences, one sees that the 
gain in matching $\sigma_8$ is at most [...] in the flat universe. Shellard and 
others argue that the change in the string density from the radiation to 
matter transition could be of assistance, but even including this change, 
Battye et al. obtain $\sigma_8 \sim 0.4$ in a flat universe with $\Omega_{\rm matter} = 0.3$, well 
below $\sigma_8 \sim 1.1$ which the cluster abundances would require. 

Fig. 11. Predictions of the global defect theories for the matter power spectrum 
(plotted against $k$ in $h\,{\rm Mpc}^{-1}$) are compared to the observations (error bars). 
The upper curve shows the prediction for standard inflation plus cold dark matter, 
normalised to COBE. 
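The size of the shortfall can be checked with a few lines of Python. The sketch takes the cluster abundance normalisation in the form $\sigma_8 \approx 0.57\,\Omega_{\rm matter}^{-0.56}$ (an assumed reading of the fit quoted in the text) and compares it with the value 0.4 attributed above to Battye et al.; the function name is introduced here only for illustration.

```python
def sigma8_required(omega_matter: float) -> float:
    """Cluster-abundance normalisation, assumed form 0.57 * Omega_matter^(-0.56)."""
    return 0.57 * omega_matter ** -0.56


if __name__ == "__main__":
    omega = 0.3
    required = sigma8_required(omega)
    predicted = 0.4  # value quoted in the text for strings in a flat Lambda universe
    print(f"required sigma_8 at Omega_matter = {omega}: {required:.2f}")
    print(f"predicted / required = {predicted / required:.2f}")
```

With $\Omega_{\rm matter} = 0.3$ the required value comes out close to 1.1, so the quoted prediction falls short by a factor of nearly three.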

My conclusion is that if current interpretations of the data are correct, 
the simplest defect theories seem in bad shape, and some fairly drastic 
change will be needed to restore them to good health. One possibility is a 
nonminimal Ricci coupling for the Goldstone bosons, as has been investi- 
gated by Pen (1998) [91]. 

Whilst the failure of the simplest defect theories is disappointing, it 
is important to emphasise that in many unified theories cosmic defects 
form at intermediate scales - anywhere from the electroweak to the Planck 
scale. Defects formed at scales well below $M_{GUT}$ would play only 
a minor role in structure formation, but they may well be there. Regardless 
of whether they formed cosmic structure or not, the detection of a 
cosmic defect would have tremendous significance for fundamental theory. 
It is important that we continue to look for ways to search our horizon 
volume for them, via their signatures in the microwave sky, as gravitational 
lenses or as sources for very high energy cosmic rays. 



18 Instantons and the beginning 

I am now going to turn to a discussion of inflation, and our recent work 
(Hawking and Turok 1998a) linking the no boundary proposal (Hartle and 
Hawking 1983) to open inflation (Gott 1982; Bucher et al. 1995; Yamamoto 
et al. 1995). 

Inflation (Guth 1981; Linde 1982; Albrecht and Steinhardt 1982) is a 
very appealing cosmological theory. It is based upon an old, simple ob- 
servation due to de Sitter and others that a cosmological constant causes 
exponential expansion of the universe. 

Inflation seeks to invoke this exponential expansion to explain the 
following fundamental puzzles regarding the initial conditions of the hot 
Big Bang. 

• Where did the hot matter come from? 

• Why was it expanding? (What went bang?!) 

• Why were the initial conditions so uniform and flat? 

Inflation is based on the observation that scalar (“inflaton”) fields have 
a potential $V(\phi)$ which can provide a “temporary” cosmological constant, 
driving an exponential expansion of the universe. This makes the universe 
very smooth and flat. As the inflaton $\phi$ rolls down to its minimum, the 
potential energy is converted into matter and radiation. The geometry is 
nearly flat: this requires that the universe be expanding at a rate given by 
the density. 

The inflationary solution is beguilingly simple, but it raises many new 
puzzles which are still unresolved: 

• What is the inflaton $\phi$ and what is $V(\phi)$? 

Unlike cosmic defects, inflation does not appear to mesh nicely with 
unification. To obtain the correct level of density perturbations in 
today’s universe one requires very flat potentials, characterised by 
a dimensionless coupling $\lambda \sim 10^{-14}$. In the simplest grand unified 
theories there is no reason for scalar self-couplings to be so small - on 
the contrary this requires ugly fine tuning. Supersymmetric theories 
offer the prospect of new fields, “moduli”, with very flat potentials, 
but additional contrivance is needed to prevent the moduli excitations 
from later dominating the universe. 
• Why was the inflaton initially displaced from its potential energy 
minimum? 

Initially it was hoped that inflation would result from a supercooled 
phase transition with the inflaton field as the order parameter. But 
the inflaton field is so weakly coupled that it is hard to understand 
how thermal couplings to other fields could localise it away from its 
potential minimum. 

• What came before inflation? 

Whether inflation happened at all, and how effective it was in flat- 
tening and smoothing the universe, depends on the initial conditions 
prior to inflation. It is often argued that the measure on the space of 
initial conditions does not really matter, because inflation is such a 
powerful (exponential) effect that almost any initial conditions will 
end up being dominated by regions of the universe which are inflating. 
One of the interesting things about the specific measures 
proposed (Hartle-Hawking or tunnelling wavefunctions) is that they 
involve much greater exponentials, which typically dwarf the exponen- 
tial increase of proper volume during inflation. At the very least these 
measures demonstrate that the initial conditions do matter, and the 
issue cannot be ignored. 

• How do we avoid the initial singularity? 

Inflation offers no solution to the singularity problem, certainly not 
if it was preceded by a hot early phase, or by primordial chaos. 
Open inflation and the no boundary proposal do offer a solution, as I 
will describe below. 

To address the last two puzzles, one needs a theory of the initial conditions 
of the universe. Usually we regard the universe as a dynamical system, 
in which we need to specify the initial conditions as well as the evolu- 
tionary laws. This is true quantum mechanically just as it is classically. 
But it is unsatisfactory that there should be a separate input required, on 
top of the laws of nature, to specify the beginning of the universe. We 
could avoid the problem if the laws of physics defined their own initial 
conditions. Something like what we want occurs in thermal equilibrium 
- the state of the system is completely specified by the Hamiltonian, and 
one just “integrates over” all possible states. We would like an analogous 
prescription for cosmology. 

Imagine constructing a set of initial conditions for cosmology. Consider 
a universe which is topologically a three sphere but with an arbitrary three 
metric, scalar and other matter fields, and their momenta. What distri- 
bution should we assume for all these fields? The simplest choice would 
be a Boltzmann distribution at a specified temperature T. But then we 
would have T as an unwanted arbitrary parameter. Worse, due to the Jeans 
instability, in the presence of gravity the thermal ensemble doesn’t exist. 

The Hartle-Hawking proposal is nevertheless close in spirit to the ther- 
mal ensemble. One defines the quantum amplitude for a given three metric 
and scalar field configuration 



$$\Psi[g^{(3)}, \phi] \propto \int^{g^{(3)},\,\phi} \mathcal{D}\phi\, \mathcal{D}g\; e^{iS} \qquad (18.1)$$



where the lower limit of the integral is defined by continuing to Euclidean 
time and compact Euclidean metrics. All such metrics and the associated 
fields are to be “integrated over” , so that there is no additional information 
needed to specify the integral. The prescription is a natural generalisation 
of the imaginary time (Matsubara) formalism in statistical physics - one 
can show that correlators calculated from (18.1) are given by a Euclidean 
path integral with no boundary, just as thermal correlators are. In thermal 
physics the size of the Euclidean region determines the temperature T: in 
the Hartle-Hawking prescription the size of the Euclidean region is dynamically 
determined by gravity, and this fixes the effective temperature to 
be the Hawking temperature of the associated background. This is nicely 
self-referential. 

The expression (18.1) is only formal and as it stands ill defined. The 
only known way of implementing the Hartle-Hawking proposal in practice 
is to find suitable saddle points of the path integral (i.e. classical instantons 
and their Lorentzian continuations) and expand around these to determine 
the Gaussian measure for the fluctuations. So one writes 



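$$\int^{g^{(3)},\,\phi} \mathcal{D}\phi\, \mathcal{D}g\; e^{iS} \approx e^{-S_E({\rm inst})} \int^{g^{(3)},\,\phi} \mathcal{D}\delta\phi\, \mathcal{D}\delta g\; e^{iS_2} \qquad (18.2)$$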



where $S_2$ is the quadratic action for small fluctuations, and $S_E({\rm inst})$ is the 
Euclidean action of the instanton. In this approximation we can compute all 
correlators of fields and fluctuations, and to second order we have complete 
quantum information about the universe. The quadratic fluctuation correlators 
are most easily calculated in the Euclidean region as well, and then 
analytically continued to the Lorentzian region, as in thermal physics. This 
is a very elegant formalism - literally everything is computed inside the microscopic 
Euclidean instanton (the “pea”!), and then analytically continued 
into the real Lorentzian universe. 

The new development was the realisation that a broader class of instan- 
tons than had previously been thought could contribute to the path inte- 
gral. These new instantons naturally give rise to open inflation, previously 
thought to be a somewhat artificial special case. 

The “old” way of obtaining open inflation is illustrated in Figures 3 
and 4. One assumes that for some reason (primeval chaos?) a scalar field 
$\phi$ was trapped in a “false minimum” of its potential, driving a period of 
exponential expansion which solves the smoothness and flatness puzzles. 
This period of “old” inflation (Guth 1981) terminates with the nucleation of 
a bubble. Bubble nucleation is described in the semiclassical approximation 
using a Euclidean instanton representing the tunnelling of the scalar field 
through the barrier between the “false vacuum” and the “slow roll” 
part of the scalar field potential. The calculation was first described in the 
gravitational context by Coleman and De Luccia (1980). 

Fig. 12. Scalar field potential for “old” open inflation. 

Fig. 13. Spacetime picture showing an open inflationary bubble. 

Bucher, Goldhaber and I showed that the value of $\Omega$ today inside such a 
bubble is determined by the distance $\phi_0$ that the inflaton field rolls during 
the period of inflation inside the bubble: 

$$\Omega_0 = \left(1 + \mathcal{A}\, e^{-2\mathcal{N}}\right)^{-1} \qquad (18.3)$$

where $\mathcal{A}$ represents the factor by which $\Omega$ deviates from unity during the 
post-inflationary epoch. This depends on the temperature to which the 
universe is heated immediately following the inflationary era, i.e. at the 
start of the “standard hot Big Bang”. If the universe is heated to the 
electroweak temperature, $\mathcal{A} \approx 10^{[\ldots]}$, whereas if it is heated to the GUT 
temperature, $\mathcal{A} \approx 10^{[\ldots]}$. The quantity $\mathcal{N}$ is the number of inflationary 
e-foldings: approximating the potential as quadratic during the slow roll 
phase one has $\mathcal{N} \approx \frac{1}{4}(\phi_0/M_{Pl})^2$ where $M_{Pl} = (8\pi G)^{-1/2}$ is the reduced 
Planck mass. 

In order to obtain an interesting value of $0.1 < \Omega_0 < 0.9$ today, one 
requires 

$$\tfrac{1}{2}\log\mathcal{A} - 1 < \mathcal{N} < \tfrac{1}{2}\log\mathcal{A} + 1 \qquad (18.4)$$

which requires tuning of $\phi_0/M_{Pl}$ to a few per cent. Note that the tuning is 
only logarithmic in the large number $\mathcal{A}$, so it is not extreme. 
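The window (18.4) is easy to verify numerically from (18.3). The Python sketch below does so for a purely illustrative choice $\mathcal{A} = 10^{60}$; that number is an assumption made here and is not taken from the text. It also uses the quadratic-potential relation $\mathcal{N} \approx \frac{1}{4}(\phi_0/M_{Pl})^2$ to translate the window into a range of $\phi_0/M_{Pl}$. Function and variable names are hypothetical.

```python
import math


def omega_today(n_efolds: float, log10_a: float) -> float:
    """Omega_0 = 1 / (1 + A exp(-2 N)), with A supplied as log10(A) to avoid overflow."""
    exponent = log10_a * math.log(10.0) - 2.0 * n_efolds  # ln(A e^{-2N})
    return 1.0 / (1.0 + math.exp(exponent))


if __name__ == "__main__":
    log10_a = 60.0                              # illustrative assumption only
    n_centre = 0.5 * log10_a * math.log(10.0)   # (1/2) log A
    for n in (n_centre - 1.0, n_centre, n_centre + 1.0):
        phi0 = 2.0 * math.sqrt(n)               # phi_0 / M_Pl from N = (phi_0 / M_Pl)^2 / 4
        print(f"N = {n:6.2f}, phi_0/M_Pl = {phi0:5.2f}, Omega_0 = {omega_today(n, log10_a):.3f}")
```

Changing $\mathcal{N}$ by one unit either side of $\frac{1}{2}\log\mathcal{A}$ moves $\Omega_0$ from roughly 0.12 to roughly 0.88, while $\phi_0/M_{Pl}$ changes only at the per cent level, which is the logarithmic tuning described above.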

The “old” approach to open inflation served as an existence proof that 
inflation could produce an observationally interesting open universe. In 
addition, two remarkable properties should be noted. First, an infinite open 
universe emerges inside a bubble which at any fixed time appears finite. This 
is allowed by general relativity because the volume of the universe is that 
of the constant density hypersurfaces, and these need not correspond to the 
timeslicing as defined by an observer external to the bubble. Second, the 
singularity of the hot Big Bang (the light cone labelled t = 0 in Fig. 4) 
is just a coordinate singularity! This is possible only for an open universe 
which is potential dominated at early times. It is very interesting that the 
problem of the initial singularity does in this sense also point to a phase of 
potential domination prior to the standard hot Big Bang. 

But “old” open inflation has some unattractive features. First, there is the 
logarithmic fine tuning mentioned above. Worse, the Coleman-De Luccia 
instanton only exists for large values of the curvature (second derivative) of 
the scalar field potential around the tunnelling barrier. The reason for this 
is that the width of the bubble wall is set by the curvature scale (effective 
mass squared $m^2$) of the potential. The width $m^{-1}$ must be smaller than 
the size of the de Sitter space throat $H^{-1}$, where $H$ is the Hubble constant 
in the false vacuum, or else the bubble just doesn’t fit inside the de Sitter 
space. This condition is only met if the potential has a “sharp” false minimum, 
a very contrived situation. There have been attempts to find more 
“generic” open inflationary models, by introducing more fields (Linde and 
Mezhlumian 1995) but it is hard to quantify the extent to which fine tuning 
has been replaced by contrivance. The physics of these models is quite 
complicated (see Garcia-Bellido et al. 1998 and Vilenkin 1998c for reviews) 
and as usual with inflationary models the initial conditions problem is only 
exacerbated by introducing more fields. 

I am going to describe a different approach which allows for open inflation 
in essentially any potential flat enough to allow significant inflation. 
Until recently it was thought that the only instantons which when analytically 
continued gave real Lorentzian universes were the Coleman-De 
Luccia instanton and a second, simpler instanton called the Hawking-Moss 
instanton (Hawking and Moss 1983). The latter occurs when the scalar 
potential has a positive extremum. In that case there is a solution in which 
the scalar field is constant, and the metric is that for de Sitter space with the 
appropriate cosmological constant. This solution is illustrated in Figure 5. 

Fig. 14. Obtaining de Sitter space from the Hawking-Moss instanton. 

This produces an inflating universe, but in the first approximation it is 
perfect de Sitter space, with too much symmetry - $O(4,1)$ - to describe 
our universe. The de Sitter space is highly unstable, and $\phi$ would rapidly 
fall off the maximum of the potential. In fact, after $\phi$ rolled down and 
reheated the universe, the final state would be very inhomogeneous. If 
there were sufficient inflation, the inhomogeneities would be on unobservably 
large scales, but the universe would also be very flat and there would be no 
observational signature of its beginnings. 



19 Singular instantons 



Hawking and I considered instead a “generic” potential without extrema 
or false vacua. In this case, there is no $O(5)$ invariant solution to the 
Euclidean field equations simply because $\phi$ cannot remain constant if its 
potential $V(\phi)$ has a slope. This is our first observation - the maximal 
symmetry for an instanton in a “generic” potential is $O(4)$. When continued 
to the Lorentzian region, this becomes $O(3,1)$, precisely what one needs for 
a homogeneous, isotropic but time dependent classical open universe. 

There exists not just one such solution but a one-parameter family of 
them, labelled by the starting value of the scalar field $\phi_0$. These solutions are 
singular but remarkably they have finite action and can therefore contribute 
to the Euclidean path integral. 

In the Euclidean region, the equations governing the instanton solution are 

$$\phi'' + 3\frac{b'}{b}\phi' = V_{,\phi}(\phi), \qquad b'' = -\frac{8\pi G}{3}\, b\left(\phi'^2 + V(\phi)\right) \qquad (19.1)$$

where the Euclidean metric is $ds^2 = d\sigma^2 + b^2(\sigma)\, d\Omega_3^2$, with $d\Omega_3^2$ the three 
sphere metric. The field $\phi$ evolves in the ‘upside down’ potential $-V(\phi)$. We 
start from the assumption that the point $\sigma = 0$ is a regular (analytic) point 
of the solution - this will correspond to the vertex of the light cone containing 
the open universe. It follows that $\phi \sim \phi_0 + c\sigma^2 + \ldots$ and $b \sim \sigma + c'\sigma^3 + \ldots$, with 
$c$ and $c'$ constants. Assuming the slope of the potential is small, $\phi'$ is small 
and the solution is close to a four sphere for most of the range of $\sigma$, with 
$b \sim H^{-1}\sin(H\sigma)$. At larger $\sigma$ however, the damping term $b'/b \sim H\cot(H\sigma)$ 
changes sign so that the motion of $\phi$ is antidamped. Nothing can now stop 
$\phi$ from running away as $b$ vanishes. For the flat potentials of interest, one 
sees that $\phi' \sim b^{-3}$ and so $b(\sigma)$ vanishes as $(\sigma_{\max} - \sigma)^{1/3}$, and $\phi$ diverges 
logarithmically. There is a curvature singularity at $\sigma = \sigma_{\max}$. 

Fig. 15. Potential for “generic” open inflation. 

Fig. 16. Complete spacetime for open instantons. 
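The qualitative behaviour just described can be reproduced by integrating the instanton equations numerically. The Python sketch below is only an illustration under several assumptions that are not in the text: a quadratic potential $V = \frac{1}{2}m^2\phi^2$ with $m = 1$, units with $8\pi G = 1$, a starting value $\phi_0 = 10$, a fixed-step fourth-order Runge-Kutta integrator, and an ad hoc stopping rule shortly before the singularity. The position of the singularity is then extrapolated using $b \propto (\sigma_{\max} - \sigma)^{1/3}$.

```python
import math

M2 = 1.0  # hypothetical mass^2 in V = (1/2) m^2 phi^2, units with 8 pi G = 1


def potential(phi):
    return 0.5 * M2 * phi * phi


def dpotential(phi):
    return M2 * phi


def rhs(state):
    """Instanton equations as a first-order system in (phi, phi', b, b')."""
    phi, dphi, b, db = state
    ddphi = dpotential(phi) - 3.0 * (db / b) * dphi
    ddb = -(b / 3.0) * (dphi * dphi + potential(phi))
    return (dphi, ddphi, db, ddb)


def rk4_step(state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * h))
    k3 = rhs(shifted(state, k2, 0.5 * h))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h * (a + 2.0 * b_ + 2.0 * c + d) / 6.0
                 for s, a, b_, c, d in zip(state, k1, k2, k3, k4))


def integrate_instanton(phi0, h=1e-4, eps=1e-3, max_steps=200000):
    """Integrate from the regular point until b has fallen to 20% of its maximum.

    Returns (estimated sigma_max, phi at the stopping point, phi' * b^3 there).
    """
    # Regular start at sigma = eps: phi ~ phi0 + V' sigma^2 / 8, b ~ sigma.
    vp = dpotential(phi0)
    state = (phi0 + vp * eps * eps / 8.0, vp * eps / 4.0, eps, 1.0)
    sigma, b_max = eps, eps
    for _ in range(max_steps):
        state = rk4_step(state, h)
        sigma += h
        b_max = max(b_max, state[2])
        if state[3] < 0.0 and state[2] < 0.2 * b_max:
            break
    phi, dphi, b, db = state
    # Near the singularity b ~ (sigma_max - sigma)^(1/3), so sigma_max - sigma ~ b / (3 |b'|).
    sigma_max = sigma + b / (3.0 * abs(db))
    return sigma_max, phi, dphi * b ** 3


if __name__ == "__main__":
    phi0 = 10.0
    hubble = math.sqrt(potential(phi0) / 3.0)
    sigma_max, phi_end, charge = integrate_instanton(phi0)
    print("sigma_max in units of pi/H:", sigma_max * hubble / math.pi)
    print("phi at stopping point:", phi_end, " phi' b^3:", charge)
```

In this sketch the scale factor closes off somewhat before the four sphere value $\pi/H$, the field has moved up the potential, and $\phi' b^3$ is positive and slowly varying near the end, consistent with $\phi' \sim b^{-3}$.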

These instantons are straightforwardly continued into the Lorentzian 
region, to obtain the spacetime sketched in Figure 7. 

The infinite open universe occurs on the right of Figure 7 - seen from 
that side it would appear as in Figure 4 - bounded by the light cone delineated 
with the dashed lines. On the left is the singular boundary of 
the Lorentzian spacetime. We have not avoided the singularity problem 
of the standard Big Bang altogether, but have instead “sidestepped it”, because 
all the correlators of interest in the open universe are computed in the 
Euclidean region and these can be analytically continued into the Lorentzian 
region. The singularity occurs as a boundary of zero proper size to the 
Euclidean region. A detailed investigation [75] shows that the field and 
metric fluctuations are perfectly well defined in the presence of the singularity, 
and the singular boundary behaves as a perfect reflector. 

Are such singular instantons allowed? The first question is whether they 
can contribute to the Euclidean path integral, i.e. whether they have finite 
Euclidean action. Surprisingly, they do. This is so because the Einstein 
action is not positive semidefinite, and the divergence in gradient energy of 
the scalar field is cancelled by an equal and opposite divergence in the Ricci 
scalar. This is essentially the same effect that allows inflation to produce 
arbitrary amounts of energy - the energy in matter is compensated by the 
negative energy of the gravitational field. 

The Euclidean action turns out to be well approximated by (Turok and 
Hawking 1998) 



127t2M^i \/fA7piU0((/)o) 



(19.2) 



The second term is a contribution coming from the singular boundary of the 
instanton (Gibbons and Hawking 1977). It is negligible in monotonic flat 
potentials, but has some interesting effects near potential maxima (Bousso 
and Linde 1998; Turok and Hawking 1998). 

There is a simple argument for the negative sign of the instanton action. 
The Euclidean action is 



$S_E \sim {\rm Vol}\,(-R + \Lambda) \sim -a^2 + a^4 \Lambda$ (19.3)

where Vol is the volume of the instanton, $R \sim a^{-2}$ the Ricci scalar, and
$\Lambda$ the effective cosmological constant. The potential for the size $a$ of the
instanton is of the familiar Higgs form - the negative sign of the Einstein
action favours $a^2 \sim M_{Pl}^2/\Lambda$, from which $S_E \sim -M_{Pl}^4/\Lambda$.
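
The scaling argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it restores a factor of $M_{Pl}^2$ on the curvature term (an assumption about units), drops all factors of order one, and uses hypothetical function names. It minimises the toy action $S_E(a) = -M_{Pl}^2 a^2 + \Lambda a^4$ and confirms that the preferred size scales as $a^2 \sim M_{Pl}^2/\Lambda$, with $S_E \sim -M_{Pl}^4/\Lambda$ at the minimum.

```python
import math


def toy_action(a, lam, m_pl=1.0):
    """Toy Euclidean action S_E(a) = -M_Pl^2 a^2 + lam a^4 (O(1) factors dropped)."""
    return -(m_pl ** 2) * a ** 2 + lam * a ** 4


def preferred_size(lam, m_pl=1.0):
    """Size a at the minimum: dS/da = 0 gives a^2 = M_Pl^2 / (2 lam)."""
    return m_pl / math.sqrt(2.0 * lam)


def minimum_action(lam, m_pl=1.0):
    """Value of the toy action at its minimum: -M_Pl^4 / (4 lam)."""
    return -(m_pl ** 4) / (4.0 * lam)


def grid_minimum(lam, m_pl=1.0, points=20001):
    """Brute-force check: scan a over [0, 2 a_min] and return the best grid point."""
    a_max = 2.0 * preferred_size(lam, m_pl)
    grid = [a_max * i / (points - 1) for i in range(points)]
    return min(grid, key=lambda a: toy_action(a, lam, m_pl))


if __name__ == "__main__":
    # Smaller lam gives a larger instanton and a more negative action.
    for lam in (1e-2, 1e-4, 1e-6):
        print(lam, preferred_size(lam), minimum_action(lam), grid_minimum(lam))
```

In this toy model the action at the minimum becomes more negative as $\Lambda$ decreases, which is the origin of the preference for small $\Lambda$ and large $a$ discussed next.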

From the path integral formula (18.2), it follows that the probability
of having such an instanton is proportional to $e^{+M_{Pl}^4/\Lambda}$. This
means that the most likely beginning to the universe is for the effective
$\Lambda$ to be very small, which means there is no inflation. The initial size of
the universe $a$ is favoured to be very large. Linde (1998) argues this to be
implausible - surely it is easier to create a small universe than a big one 
“from nothing”. Personally, I don’t view this as “creation from nothing” 
(which I am not sure makes any sense!), but rather I see the Hartle Hawking 
prescription as a measure for the set of initial conditions. One can show
that the entropy associated with de Sitter space (the area of the de Sitter
horizon) scales just as the negative of the Euclidean action, suggesting that the
reason why the large universes dominate is simply that there are more
states available for them.

As mentioned above, the exponent is tremendously large: for a theory 
with V = the probability 

P (X ~ (19.4) 

where $N$ is the number of e-foldings. This exponent drastically outweighs
that due to the increase in proper volume in inflation, which is a puny $e^{3N}$!
However, as we discussed in my lectures at Les Houches, if one multiplies the
two factors, then at large enough $N$ the volume factor wins. Remarkably,
the volume factor begins to win at just the point where “eternal inflation” is
alleged to occur, when quantum fluctuations of the scalar field overcome the
classical rolling down the hill. If this multiplication can be justified then the
predictions of the simplest inflationary models could indeed be that $N \gg 1$,
with inflation starting at the Planck scale.

Ignoring this possibility for the moment, there are three other ways
out. Either the scalar field theory is wrong, the Hartle-Hawking ansatz
is wrong, or we are just asking the wrong question. Let us begin with
the last possibility. Perhaps it is too much to ask that the theory only
produces universes like ours. If it allows other, uninhabitable, universes we
should exclude them from consideration, just as we are not surprised to
find ourselves living on the surface of a planet, rather than in empty space.

We live in a galaxy, i.e. a nonlinear collapsed region, which was likely
essential to the formation of heavier elements and life. Let us therefore 
condition the prior probability given by (19.4) by the requirement that 
there is a galaxy at our spatial location. Such a conditioning is of course a 
version of the anthropic principle, but a very minimal version. I prefer to 
see it as just using up some of the data we know about the universe (e.g.
that we live in a galaxy) in order to explain other data. We do so in the 
hope that a more complete theory of the origin of life will explain why only 
universes containing galaxies should matter. 

Hawking and I imposed the “anthropic” requirement via Bayes' theorem:

$\mathcal{P}(\Omega_0|{\rm gal}) \propto \mathcal{P}({\rm gal}|\Omega_0)\,\mathcal{P}(\Omega_0) \propto \exp\left(-\frac{\delta_c^2}{2\sigma_{gal}^2} - 2S_E(\phi_0)\right)$ (19.5)

where the first factor is the probability that a galaxy sized region about us
underwent gravitational collapse. This is Gaussian because we are assuming
the usual inflationary fluctuations. The rms perturbation $\sigma_{gal}$ is the rms
amplitude for fluctuations on the galaxy scale, equal to the perturbation
amplitude at Hubble radius crossing for the galaxy scale $\Delta(\phi_{gal})$ multiplied
by the growth factor $G(\Omega_0)$. The latter is strongly $\Omega_0$ dependent. The
quantity $\delta_c$ is the linear theory amplitude of a perturbation when it undergoes
collapse.

The idea is very simple. The Euclidean action favours a very small
number of e-foldings, which from the formula (18.3) means very small $\Omega_0$.
However galaxy formation requires the growth of density perturbations, and
this does not occur unless $\Omega_0$ is substantial. For Gaussian perturbations (as
inflationary models predict), this effect is also exponential and can compete
with the Euclidean action. If one maximises the posterior probability (19.5),
the most likely value of $\Omega_0$ is of order 0.01 for generic slow roll inflationary
potentials. Such a low value is unacceptable in comparison with the data,
but it isn’t such a bad miss. As far as I know, this is the first attempt to
calculate $\Omega_0$ from first principles, and I find it encouraging that it is not so
far wrong.

In my view the most questionable component of the theory is the scalar
field $\phi$, with its inflaton potential. As I discussed above, the identity of $\phi$ is
still wide open, and it is conceivable that $\phi$ cannot be modelled as a single,
simple scalar field. There may well be additional fields coupling to $\phi$ which
affect the prior probability.

The second possibility (Linde 1988; Vilenkin 1988b) is that the Hartle 
Hawking prescription is incorrect. Linde and Vilenkin have suggested alter- 
nate prescriptions which have the effect of reversing the sign of the action 
for the Euclidean instanton, favouring the universe starting at a high den- 
sity. I do not have space to discuss these here (for a critique see Hawking 
and Turok 1998b) but they appear to be significantly less well defined. In 
particular it is not clear how they are consistent with general coordinate in- 
variance. The appealing thing to me about the Hartle Hawking prescription 
is that it is at least in principle the natural generalisation of the thermal 
ensemble to higher dimensions, based on relativity and quantum mechanics. 
I think we should be reluctant to abandon it without good reason. 

The third possibility is that we must impose more stringent anthropic 
requirements, e.g. that the density of the galaxy we live in should not be 
too great or else planetary systems would be disrupted (see e.g. Tegmark 
and Rees 1998). This effect would also act to increase the predicted $\Omega_0$.

Finally, one can question the whole enterprise of using singular instan- 
tons in cosmology. Vilenkin raised an interesting objection. He showed that 
there are yet more general singular instantons which are asymptotically flat 
and would at face value lead to an instability of Minkowski spacetime. On 
these grounds he suggested that all singular instantons should be excluded. 
To see Vilenkin’s instanton, consider solving the field equations (19.1) starting
near the singular point, with $b \sim (\sigma_{max} - \sigma)^{1/3}$ and $\phi \sim -\log(\sigma_{max} - \sigma)$.
For simplicity consider a theory with no potential $V$; in general there are
similar solutions if one arranges for the field to asymptotically tend towards
the value for which $V$ vanishes. If $V$ is negligible, we have $\phi' \propto b^{-3}$, and
the equation for $b$ has a solution $b \sim \sigma$ at large $b$. The absence of potential
energy means that instead of closing in on itself as in the open inflationary
instantons, the solution opens out into an asymptotically flat Euclidean
space. The instanton can be cut on a three surface running horizontally and
intersecting the singularity at X (see figure). The continuation to Lorentzian
spacetime describes an asymptotically flat space with an expanding hole in
it. One can show that the action for such Euclidean instantons can be
made arbitrarily small, so on the face of it flat spacetime would be terribly
unstable to the appearance of holes.

Fig. 17. Vilenkin’s asymptotically flat instanton (left) is compared with the open
inflationary instanton (right). Both have a singularity at X, and both are cut
horizontally in two by the surface on which one matches to the Lorentzian region.
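
The two limiting behaviours quoted above can be checked numerically. The sketch below assumes the standard O(4)-symmetric Euclidean constraint $b'^2 = 1 + k/b^4$ for a massless field with $\phi' = C/b^3$, where $k \propto C^2/M_{Pl}^2$ absorbs all constants. This form is an assumption here, since equations (19.1) are not reproduced in this section, and the function names are hypothetical. Measuring $\sigma$ from the singular point, the code integrates $d\sigma/db = b^2/\sqrt{b^4 + k}$ and estimates the logarithmic slope $d\ln b/d\ln\sigma$.

```python
import math


def sigma_of_b(b, k=1.0, n=20000):
    """Proper distance from the singularity (b = 0) out to radius b.

    Composite Simpson rule for the integral of x^2 / sqrt(x^4 + k) on [0, b].
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = b / n

    def integrand(x):
        return x * x / math.sqrt(x ** 4 + k)

    total = integrand(0.0) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def log_slope(b, k=1.0, eps=1e-3):
    """Finite-difference estimate of d ln b / d ln sigma at radius b."""
    s_lo = sigma_of_b(b * (1.0 - eps), k)
    s_hi = sigma_of_b(b * (1.0 + eps), k)
    d_ln_b = math.log((1.0 + eps) / (1.0 - eps))
    return d_ln_b / (math.log(s_hi) - math.log(s_lo))


if __name__ == "__main__":
    print("near the singularity:", log_slope(1e-2))  # close to 1/3
    print("far from it:         ", log_slope(1e3))   # close to 1
```

A slope near 1/3 at small $b$ corresponds to $b \sim (\sigma_{max} - \sigma)^{1/3}$, and a slope near 1 at large $b$ corresponds to the asymptotically flat behaviour $b \sim \sigma$.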

In order for there to be an instability however, one needs more than an 
instanton solution. One must show that the instanton possesses a negative 
mode. This will produce a Gaussian with a wrong sign in the path integral, 
requiring a distortion of the integration into the complex plane. In this way 
a self energy amplitude can acquire an imaginary piece, which leads to a decay
rate. I have recently completed a calculation of the fluctuation spectrum 
about Vilenkin’s instantons and shown that when they are properly defined 
as constrained instantons there is in fact no negative mode. This shows 
that flat space is stable to singular instantons and removes this particular 
objection [102]. Nevertheless I would agree that more needs to be done to
establish to what extent one could expect singular instantons to dominate 
the path integral. 

The singularity is an indication that the theory is incomplete, but this is 
in any case no surprise - in any theory of quantum gravity the 
Einstein action will acquire significant corrections at short distances. But
our ability to predict is not necessarily compromised. I would draw the
analogy with the hydrogen atom, where the quantum mechanics is perfectly
sensible, even though the Coulomb singularity points to the need for a more
complete theory. In our case, calculation has shown that the perturbations 
are unambiguously defined in the Euclidean region (at least at one loop), 
so there is reason to hope that whatever underlying physics resolves the 
singularity will not affect our predictions. 

There is an intriguing suggestion due to Garriga (1998) that the singu- 
larity may be removed in a higher dimensional theory. Garriga shows that a 
solution to the five dimensional Einstein equations with a cosmological con- 
stant (actually a five sphere) may be written in a “dimensionally reduced” 
form in such a way that the effective four dimensional metric has precisely 
the same sort of singularity that our open inflationary instantons do. The 
singularity is then seen as an artefact of trying to look at a five dimensional 
metric from a four dimensional point of view. Garriga’s model isn’t real- 
istic because it has too much symmetry - the only way to continue a five 
sphere to a Lorentzian spacetime is to continue at an equator, obtaining 
five dimensional de Sitter space. But it is likely that the idea will generalise 
to less symmetric instantons, which may allow a realistic expanding four 
dimensional universe with a frozen extra dimension. 



20 The four form and $\Lambda$

The recent observations of supernovae at high redshift, when combined with 
observations of the cosmic microwave anisotropy, give evidence in favour of 
a nonzero cosmological constant in today’s universe. It is hard to under- 
stand why the cosmological constant is small: theoretical prejudice has for 
a long time been that there must be some as yet undiscovered symmetry or 
dynamical mechanism that sets it to zero. It is even harder to understand why
the cosmological constant should have a value such that it is just beginning to
dominate the density of the universe today. One possibility is an anthropic 
argument, and this has been pursued by a number of authors (Efstathiou 
1995; Vilenkin 1995; Weinberg 1996; Martel et al. 1997). The anthropic 
argument is particularly powerful here because a cosmological constant has 
such a dramatic effect on the expansion of the universe. Hawking and I have 
recently discussed how the four form field strength in supergravity fits in 
with these ideas, allowing for an anthropic solution of the cosmological con- 
stant problem in which the cosmological constant provides a non-negligible 
contribution to the density of today’s universe. 

The four form field is a natural addition to the field content of the world, 
and is demanded by the simplest candidate field theory of quantum grav- 
ity, eleven dimensional supergravity. Upon dimensional reduction to four 
dimensions, the four form field strength possesses a remarkable property. 
Namely the general solution of the field equations is parametrised by just a 
single, spacetime independent, constant p. This constant contributes to the
effective cosmological constant in the four dimensional Einstein equations,
with the contribution to $\Lambda$ being proportional to $p^2$. If p is appropriately
chosen, this contribution can cancel other contributions coming from elec- 
troweak symmetry breaking, confinement, chiral symmetry breaking and 
so on. 

Now that we have a variable cosmological constant, we have to ask what
the prior probability distribution is for it. The Hartle Hawking proposal
provides us with such a distribution. There is a subtlety in calculating the
appropriate Euclidean action, which led to a disagreement about the sign of 
the action (Hawking 1984; Duff 1989), which I think is now resolved (Turok 
and Hawking 1998). The point is that the arbitrary constant p mentioned 
above is actually the canonical momentum of a free quantum mechanical 
particle. The eigenvalue of this momentum determines the value of the 
cosmological constant today. When one calculates the wavefunction from 
Feynman’s path integral, one is interested in the amplitude for a particular 
classical universe. The effective cosmological constant in that classical uni- 
verse is specified by the four form momentum. So one needs to calculate 
the path integral for the four form in the momentum representation. 

This gives a probability distribution analogous to that above, where the
most likely universe is one with an initial net cosmological constant (i.e.
including $V(\phi)$) equal to zero. Such a universe would never give inflation.
But as before we can calculate the posterior probability for $\Lambda$ and $\Omega_0$ given
that we live in a galaxy. This requirement is even more stringent upon $\Lambda$
than it is on $\Omega_0$ - if $\Lambda$ were not very tiny in Planck units, the universe would
have recollapsed, or begun an epoch of inflation, even before the galaxy
mass scale had crossed the Hubble radius. So to even discuss the existence
of galaxies $\Lambda$ has to be in an extremely narrow band about zero. And across
this very narrow band, as anticipated by Efstathiou and Weinberg, the prior
probability distribution for $\Lambda$ turns out to be very flat. Our calculations
therefore support the assumptions made in their anthropic estimates of $\Lambda$.
Unfortunately the inferred probability distribution for $\Lambda$ is rather broad,
and with only one possible measurement, the theory seems hard to test.
Nevertheless if the observations of a nonzero $\Lambda$ hold up, the above
mechanism provides one of the few conceivable ways of understanding it.

21 Conclusions 

The impartial observer of these lectures might conclude that we have gone 
backwards rather than forwards! Electroweak baryogenesis is an appeal- 
ing idea, but it does not work in the minimal theory and there is so far 
no compelling reason to believe in the kind of extensions needed to make 
it work. Cosmic defects have definitely taken a big knock with improvements
in theoretical techniques and most importantly observational data.
(Although they could well exist but just not play the major role in structure
formation.) The basic moral of the cosmological instanton approach is that 
at least at first sight most simple inflationary models fail by predicting a 
universe that is too open. 

I consider all of these developments as real progress, in that we have 
formulated cosmological ideas precisely enough that they can be rigorously 
tested. One of the greatest dangers in a field like cosmology where we are 
discussing events that happened in the distant past and for which there is 
only indirect evidence, is self-delusion. I think we should be honest about 
this! Excluding well formulated theories is a respectable way to do science 
and I am excited by the rapid progress we have seen in this respect.

Abstract

This is a series of lectures on M-theory for cosmologists. After sum- 
marizing some of the main properties of M-theory and its dualities 
I show how it can be used to address various fundamental and phe- 
nomenological issues in cosmology. 

1 Introduction 

This is a series of lectures on superstring/M-theory for cosmologists. It is 
definitely not a technical introduction to M-theory and almost all technical 
details will be omitted. A secondary aim of these lectures (or rather the 
lecture notes - for there will probably not be many professional string theo- 
rists at the actual sessions) is to proselytize for a certain point of view about 
M-theory, which is not the conventional wisdom. A crude statement of this 
point of view is that many of the key questions of M-theory can be asked 
only in the cosmological context, in particular the central phenomenological 
question of vacuum selection. I also believe that some of the fundamental 
structure of M-theory, and the relation between quantum mechanics and 
spacetime geometry is obscured when one tries to study only Poincare in- 
variant vacuum states of the theory, and ignore cosmological questions. The 
latter ideas are very speculative however, and I will not discuss them here. 

The classic justification of string theorists for studying states of M-theory 
with d > 4 Poincare invariance, in a world which is evidently cosmological, 
is that the universe we observe is locally approaching a Poincare invariant 
vacuum. Many of the properties of the world should be well approximated 
by those of a Poincare invariant state. It is a philosophy rooted in particle 
physics, and we shall see that it has been quite successful in M-theory as well. 
One of the key features of such states is that they can have superselection
sectors (a special case of which is the phenomenon of spontaneous breaking
of global symmetries). There can be different Poincare invariant states in the
same theory which “do not communicate with each other” in the following
sense: certain finite energy excitations of Poincare invariant vacua can be
classified as asymptotic states of a number of species of particles. States with
any finite number of particles differ from the vacuum only in a local vicinity
of the particles’ asymptotic trajectories (this is more or less the cluster
property). We can construct the scattering matrix for particle excitations
of a given vacuum state and it is unitary. No initial multiparticle excitation
of a given vacuum ever scatters to produce excitations of another^1. In
supersymmetric (SUSY) theories it is quite common to have superselection 
sectors that are not related by a symmetry. A single theory can produce 
many different kinds of physics, which means that it does not make definite 
predictions. 

In the early days of string theory, when the vast vacuum degeneracy of 
string perturbation theory was discovered, it was hoped that nonperturba- 
tive effects would either lift the degeneracy or show us that many of the 
apparent classical ground states were inconsistent (as e.g. an SU(2) gauge
theory with an odd number of isospin one half fermions is inconsistent). 
From the earliest times there were arguments that this was unlikely to be 
true for those highly supersymmetric ground states which least resemble 
the real world. Recent discoveries in string duality and M-theory make it 
virtually impossible to believe that these archaic hopes will be realized. 

From the very beginning I have argued (mostly in private) that the 
resolution of this degeneracy would only come from the study of cosmology^2.
That is, the physics that determines the correct Poincare invariant vacuum 
took place in the very early history of the universe. To understand it one 
will have to understand initial conditions, and not just stability criteria for 
possible endpoints of cosmological evolution. Not too much progress has 
been made along these lines, but there are not many people thinking about 
the problem (I myself have probably devoted a total of no more than two 
years since 1984 to this issue). Nonetheless, I hope to convince you that it 
is a promising area of study. 

Associated with the vacuum degeneracy, there are massless excitations. 
I do not have an argument for this which does not depend on an effective 
field theory approximation. In effective field theory, the vacuum degener- 
acy is parametrized by the zero modes of a collection of scalar fields (which 
we will call the moduli fields or simply the moduli^3) with no potential.



^1 I have taken pains here not to use arguments from local field theory, which can be
only approximate in M-theory. 

^2 The earliest conversation of this type that I remember was with Dan Friedan and
took place in 1986 or 1987. 

^3 The term moduli space is used by mathematicians to describe multiparameter families
of solutions to some mathematical equations or conditions. Thus one speaks of “the
moduli space of Riemann surfaces of genus g” or “the moduli space of solutions to the
X equation”. Physicists have adopted this language to describe spaces of degenerate
ground states of certain supersymmetric theories. We will be making a further abuse of
the terminology in our discussion of cosmology.

Fields like this spell trouble for phenomenology. It is difficult to find
arguments that they couple significantly more weakly than gravity (see
however [19]), and there is no reason for them to couple universally. Thus, they
should affect the orbits of the planets and Eotvos-Dicke experiments. 

On the other hand, we know that SUSY is broken in the real world, and 
then there is no reason for scalar fields to remain massless. This however 
does not eliminate all of the problems and opportunities associated with 
the moduli. First of all, one can argue that the potential for the mod- 
uli vanishes in many different, phenomenologically unacceptable, extreme 
regions of moduli space, where supersymmetry is restored. Examples of 
such regions are weakly coupled SUSY string compactifications and regions 
where the world has more than four large dimensions. A quite general ar- 
gument [55] shows that one cannot find a stable minimum of the system by 
any systematic expansion in the small parameters which characterize those 
extreme regions. Either one must accept the possibility of different orders 
in an asymptotic expansion being equally important in a region where the 
expansion parameter is small, or one is led to expect that the moduli vary 
with time on cosmological time scales. The latter option typically leads to 
unacceptable time variation of the constants of nature. Of course, it might 
also provide interesting models of the fashionable “quintessence” [9], if these
difficulties can be overcome. 

Even if one finds a stable minimum for the modular potential there 
are still difficulties. These are a consequence of additional assumptions 
about the nature of SUSY breaking. It is usually assumed that SUSY 
has something to do with the solution of the gauge hierarchy problem of 
the standard model. If so, the masses of superpartners of quarks should 
not be more than a few TeV and one can show that this implies that the 
fundamental scale of SUSY breaking cannot be larger than about 10^^ GeV. 
One then finds that the moduli typically have masses and lifetimes which 
are such that the universe is matter dominated at the time nucleosynthesis 
should have been occurring. This is the cosmological moduli problem. There 
have been several solutions proposed for it, which are discussed below. On 
the other hand, there has been much recent interest in models with very low 
scales of SUSY breaking. These include gauge mediated models and models 
with TeV scale Planck/String mass and large extra dimensions^4. Here the
cosmological moduli problem is more severe, though a recent paper claims 
that it can be solved by thermal inflation [18]. 
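
An order-of-magnitude sketch of the cosmological moduli problem may help here. It rests on assumptions that are not spelled out in the text: a modulus of mass $m$ decays through Planck-suppressed couplings with rate $\Gamma \sim m^3/M_{Pl}^2$, and the universe reheats to $T_R \sim \sqrt{\Gamma M_{Pl}}$ when it decays. All factors of order one are dropped, and the function names and the 1 MeV nucleosynthesis threshold are illustrative choices.

```python
import math

M_PLANCK_GEV = 2.4e18      # reduced Planck mass in GeV
T_NUCLEOSYNTHESIS_GEV = 1e-3  # roughly 1 MeV, taken as the onset of nucleosynthesis


def modulus_decay_rate_gev(mass_gev):
    """Assumed Planck-suppressed decay rate, Gamma ~ m^3 / M_Pl^2 (in GeV)."""
    return mass_gev ** 3 / M_PLANCK_GEV ** 2


def reheat_temperature_gev(mass_gev):
    """Assumed reheat temperature after the modulus decays, T_R ~ sqrt(Gamma M_Pl)."""
    return math.sqrt(modulus_decay_rate_gev(mass_gev) * M_PLANCK_GEV)


def spoils_nucleosynthesis(mass_gev):
    """True if the modulus would still dominate when nucleosynthesis should occur."""
    return reheat_temperature_gev(mass_gev) < T_NUCLEOSYNTHESIS_GEV


if __name__ == "__main__":
    for mass in (1e3, 1e5):  # 1 TeV and 100 TeV
        print(mass, reheat_temperature_gev(mass), spoils_nucleosynthesis(mass))
```

With these assumptions a modulus of mass 1 TeV reheats the universe to roughly 0.02 MeV, well below the temperature at which nucleosynthesis should take place, whereas a mass of order 100 TeV would give roughly 20 MeV.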

Another potential problem with moduli was pointed out by [54]. Since 
the SUSY breaking scale is smaller by orders of magnitude than the natural 



^4 The latter are often discussed without reference to SUSY, since the hierarchy problem
is “solved” by the low Planck scale. However, if they are to be embedded into M-theory
they must have SUSY, broken at the TeV scale.

scale of the vacuum energy during inflation (in most models) one must find
an explanation of the discrepancy. A favored one has been that the true 
vacuum lies fairly deep in an extreme region of moduli space, typically the 
region of weak string coupling. The universe then begins its history at an 
energy density many orders of magnitude larger than the barrier which sep- 
arates the true vacuum from the region of extremely weak coupling where 
time dependent fundamental parameters and unwanted massless particles 
destroy any possibility of describing the world we see. Why doesn’t it “over- 
shoot” the true vacuum and end up in the weak coupling regime? We will 
discuss a cosmology at the end of these lectures that resolves this problem. 

Not all the news is bad. One of the things I hope to convince you of 
in these lectures is that M-theory moduli are the most natural candidates 
in the world for inflaton fields. The suggestion that moduli are inflatons 
was first made in [24]. The word natural is used here more or less in its 
technical field theoretic sense; that large dimensionless constants in an
effective Lagrangian require some sort of dynamical explanation. For moduli,
in order to get n e-foldings of slow roll inflation one needs dimensionless 
parameters of order 1/n in the Lagrangian. Another interesting point is 
that with moduli as inflatons, the scale of the vacuum energy that is re- 
quired to explain the amplitude of primordial density fluctuations is the 
same as the most favored value of the unification scale for couplings and 
close to the scale determining the dimension five operator that gives rise to 
neutrino masses. These numbers fit best into an M-theory picture similar 
in gross detail to that first proposed by Witten [44] in the context of the 
Hořava-Witten [43] description of strongly coupled heterotic strings. In such
scenarios, $10^{16}$ GeV is the fundamental length scale and the fields of the
standard model live on a domain wall in an eleven dimensional space with 
7 compact dimensions of volume ~ 10^ fundamental units. The four dimen- 
sional Planck scale is an artifact of the large volume. The SUSY breaking 
vacuum energy responsible for inflation must also come from effects confined 
to a (perhaps different) domain wall. 
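
The statement that the four dimensional Planck scale is an artifact of the large volume can be illustrated with dimensional analysis. The sketch assumes the usual Kaluza-Klein relation $M_{Pl}^2 \sim M_{11}^9 V_7$ (order-one factors dropped) and takes, as an assumed input, a fundamental scale near the unification scale of roughly $10^{16}$ GeV. The volume of the compact space in fundamental units is then $(M_{Pl}/M_{11})^2$. The function and constant names are hypothetical.

```python
REDUCED_PLANCK_MASS_GEV = 2.4e18  # four dimensional reduced Planck mass
ASSUMED_FUNDAMENTAL_SCALE_GEV = 1e16  # assumed input, of order the unification scale


def compact_volume_in_fundamental_units(m_planck_gev, m_fundamental_gev):
    """V_7 * M_11^7 from M_Pl^2 ~ M_11^9 V_7, i.e. (M_Pl / M_11)^2."""
    if m_fundamental_gev <= 0:
        raise ValueError("fundamental scale must be positive")
    return (m_planck_gev / m_fundamental_gev) ** 2


if __name__ == "__main__":
    volume = compact_volume_in_fundamental_units(
        REDUCED_PLANCK_MASS_GEV, ASSUMED_FUNDAMENTAL_SCALE_GEV
    )
    print(f"compact volume in fundamental units: {volume:.2e}")
```

With these inputs the required volume lies between $10^4$ and $10^5$ fundamental units. The precise number depends on the order-one factors and on the exact value assumed for the fundamental scale.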

To summarize, M-theory has a number of features which require cosmo- 
logical explanations and a number of potentially interesting implications for 
cosmologists. The two subdisciplines have very different cultures, but they 
ought to see more of each other. The plan of these lectures is as follows. 
I will first introduce the elements of string duality and M-theory, starting 
from the viewpoint of 11D SUGRA, which involves the smallest number of
new concepts for cosmologists. The key ideas will be the introduction of the
basic half SUSY preserving branes of 11D SUGRA and the demonstration
of how various string theories arise as limits of compactified versions of the
theory. We will see that the geometry and even the topology of space as
seen by low energy observers can change drastically in the course of making
smooth changes of parameters. Another key concept is that of the moduli
space of vacua which preserve a certain amount of SUSY, and the various
kinds of nonrenormalization theorems which allow one to make exact
statements about the properties of these spaces. 

From this we will turn to a discussion of the fundamentals of quantum 
cosmology. This discussion will be incomplete since the material is still un- 
der development. We will review the problem of time in quantum cosmology 
and a standard resolution of it based on naive semiclassical quantization of 
the Wheeler-DeWitt equation. We will argue that M-theory promises to put 
this argument on a more reliable basis, and in particular that the peculiar 
Lorentzian metric on the space of fields, which is the basis for the success 
of the Wheeler-DeWitt approach to the problem of time, can be derived 
directly from the duality group of M-theory (at least in those highly super- 
symmetric situations where the group is known). This leads naturally into 
a discussion of whether cosmological singularities can be mapped to nonsin- 
gular situations via duality transformations (it turns out that some can and 
some can’t and that the distinction between them defines a natural arrow 
of time). We also present a weak anthropic argument which attempts to 
answer the question of why the world we see is not a highly supersymmetric 
stable vacuum state of M-theory. Finally, as an amusement for aficionados 
of heterodoxy, we present some suggestions for M-theoretic resolutions of 
certain cosmological conundra without the aid of inflation. 

The remainder of the lectures will be devoted to more or less standard 
inflationary models based on moduli and will examine in detail the proper- 
ties of these models adumbrated above. Compared to much of the inflation 
literature, these sections will be long on general properties and short on 
specific models which can be compared to the data. M-theory purports to 
be a fundamental theory of the universe, rather than a phenomenological 
model. Inflaton potentials are objects to be calculated from first principles 
rather than postulated in order to fit the data. There is nothing wrong with 
phenomenological models of inflation, but they are not the real province 
of M-theory cosmology. Unfortunately, our understanding of the nonper- 
turbative properties of M-theory in the regime where the supersymmetry 
algebra is sufficiently small to allow for a potential on the moduli space (the 
alert reader may have already noted that the preceding phrase contains an 
oxymoron) is too limited to allow us to build reliable models of the potential. 
Thus, if we are honest, we must content ourselves with general observations 
and the definition of a set of goals. 

2 M-theory, branes, moduli and all that 

2.1 The story of M 

Once upon a time there were six string theories. Well, actually there were
five (because one, the Type Ia theory, was an ugly duckling without enough
Lorentz invariance) and actually there were an infinite number, or rather
continuous families... What’s going on here? The basic point is the
following: what string theorists called a string theory in the old days was
a set of rules for doing perturbation theory. What was perhaps misleading
to many people is that these rules were usually given in terms of a
Lagrangian, more generally a superconformally invariant 1+1 dimensional
quantum field theory (with some extra properties). We are used to thinking of
Lagrangians as defining theories. The better way to think of the world sheet 
Lagrangians of string theory is by imagining a quantum field theory with 
many classical vacua. Around each vacuum state we can construct a loop 
expansion. The quadratic terms in the expansion around a vacuum state 
define a bunch of differential operators, whose Green’s functions are the 
building blocks of the perturbation expansion. Using Schwinger’s proper 
time techniques we can describe these Green’s functions in terms of an aux- 
iliary quantum mechanics, and if we wish we can describe this quantum 
mechanics in terms of a Feynman path integral with a Lagrangian. The 
world sheet path integrals of string theory are the analogs of these proper 
time path integrals. One of the beautiful properties of string theory is that, 
unlike field theory, the Lagrangian formulation of the propagator completely 
determines the perturbation expansion. To compute an n particle scatter- 
ing amplitude in tree level string theory one does the path integral on a 
Riemann surface with no handles and (for theories whose perturbation ex- 
pansion contains only closed strings) n boundaries. The boundary condi- 
tions on the boundaries are required to be superconformally invariant and 
carry fixed spacetime momentum. The Lagrangian itself is superconformally 
invariant, and the allowed boundary conditions are generated by acting on 
a particular boundary condition which defines the ground state of the single 
string with a set of vertex operators which represent small perturbations of 
the action which preserve superconformal invariance⁵. A given vertex operator 
creates a state of the string which propagates as a particle with given 
mass and quantum numbers. The higher orders in perturbation theory just 
correspond to computing the same path integral on Riemann surfaces of 
higher genus. One sums over all Riemann surfaces, or in some cases only 
over oriented ones. 

⁵Actually the situation is a bit more complicated, and to do it justice one must use 
the BRST formalism. For our purposes we can ignore this technicality. 

The conditions of superconformal invariance have many solutions. Classically 
(in the sense of two dimensional classical field theory - this should 
not be confused with tree level string theory which corresponds to summing 
all orders in the semiclassical expansion of the world sheet field theory, on 
Riemann surfaces with no handles), for the particular case of Type II string 
theories, the bosonic terms in the most general superconformal Lagrangian 
have the form 

$$\mathcal{L} = (G_{\mu\nu}(x) + i B_{\mu\nu}(x))\, \partial x^\mu \bar{\partial} x^\nu + h.c. + \Phi(x) \chi$$ (2.1) 

where the derivatives are taken with respect to complex coordinates on 
a Euclidean world sheet. $G_{\mu\nu}$ is symmetric and $B_{\mu\nu}$ is antisymmetric. 
$\chi$ is the Euler density of the world sheet, a closed two form (for more on 
forms, closed and otherwise, see below) whose integral is the Euler character. 
Quantum mechanically, there are restrictions on the functions $G$, $B$, 
and $\Phi$. To lowest order in the world sheet loop expansion the condition of 
superconformal invariance coincides with the equations of motion coming 
from a spacetime Lagrangian 

$$\mathcal{L}_{st} = \sqrt{-g}\, e^{-2\Phi} [R + 4(\nabla\Phi)^2 + (dB)^2].$$ (2.2) 

This fact, combined with the fact that vertex operators are allowed per- 
turbations of these equations, shows us that string theory is a theory of 
gravity. One cannot choose the background metric arbitrarily; it must sat- 
isfy an equation of motion. 

In classifying consistent solutions of all these rules, string theorists found 
a number of discrete choices depending on the number of fermionic genera- 
tors in the world sheet superconformal group and on the types of Riemann 
surfaces allowed. This led to the five different types of string theory. Once 
these discrete choices were made, there still seemed to be a multiparameter 
infinity of choices. However, it was soon understood that the continuous 
infinity corresponded to expanding the same basic theory around different 
solutions of its classical equations of motion (approximately the equations 
generated by Eq. (2.2)). This was a little surprising, because one of the 
rules for the perturbation expansion was that there be some number of flat 
Poincare invariant dimensions (I explained in the introduction why string 
theorists insisted on this). So each of these solutions was a static classical 
vacuum state. Why are there so many vacua? The answer is spacetime 
SUSY. Indeed almost every known perturbatively stable vacuum state of 
string theory is supersymmetric⁶. It is well known that spacetime SUSY 
often leads to nonrenormalization theorems which prevent the existence of 
potentials for scalar fields. The strongest theorems of this type come when 
there is enough SUSY to guarantee that the scalars are in supermultiplets 
with gauge or gravitational fields, but there are other examples. As noted 
in the introduction, we will call the space of classical vacua the moduli 
space. It should be noted that the moduli space is not connected (e.g. the 
branches with different amounts of SUSY are generally disconnected from 
each other), nor is it a manifold. The reason for the latter property is 
that often, new massless states can appear on submanifolds of the moduli 
space. Often these include scalars, which, as long as the original moduli 
are restricted to the submanifold, have no potential. One can then define a 
new branch of moduli space on which the original moduli are restricted to 
the submanifold, but the new massless scalar fields have expectation values. 
Thus the moduli space has several disconnected components, each of which 
is a bunch of manifolds of different dimension, glued together along singular 
submanifolds. 

⁶There are no known exceptions. Recently however [8] some constructions which 
appear to be stable at least through two loops have been found. This is the reason for 
the word almost in the text. 

Thus, circa 1994-95 we had five discrete classes of string theory: 
Type IIa,b; Heterotic A,B (A refers to the $E_8 \times E_8$ heterotic string theory 
and B to the $SO(32)$ version) and Type Ib. The Type Ib theory has the 
same symmetries in spacetime as the HetB theory although the world sheet 
theories are completely different. In Type Ib the gauge quantum numbers 
are carried on the ends of open strings (like flavor quantum numbers on 
QCD strings) and nonorientable world sheets appear in the perturbation 
expansion. The heterotic theory has only closed strings, orientable world 
sheets and gauge quantum numbers carried by the body of the string. There 
is also a Type Ia theory which has a similar relation to HetA. This theory 
does not have ten dimensional Lorentz invariance because it has two 8+1 
dimensional domain walls at the ends of a finite or infinite 9+1 dimensional 
“interval”. It has $SO(16) \times SO(16)$ gauge symmetry carried by the ends of 
open strings which can only propagate on the domain walls. 

The labels A and B refer to theories which are different in 10 (the max- 
imal dimension for perturbative strings) dimensions but are actually equiv- 
alent to each other when compactified on a circle. The equivalence is due 
to a stringy symmetry called T duality. The momentum of a string on a 
circle is the integral of the time derivative of its coordinate; i.e. the time 
derivative of the center of mass position. 

$$P = \int d\sigma\, \partial_t \theta.$$ (2.3) 

Strings on a circle carry another quantum number called winding number, 
which is defined by 

$$w = \int d\sigma\, \partial_\sigma \theta.$$ (2.4) 

The Euclidean world sheet Lagrangian for the string coordinate $\theta$ is 

$$\mathcal{L} = (\partial_t \theta)^2 + (\partial_\sigma \theta)^2.$$ (2.5) 

Instead of $\theta$ we can introduce a new coordinate $\tilde{\theta}$ by the two dimensional 
analog of an electromagnetic duality transformation 

$$\partial_a \theta = \epsilon_{ab} \partial_b \tilde{\theta}.$$ (2.6) 

It turns out that when one performs this transformation one automatically 
takes a Type A theory to a Type B theory. 
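
The exchange of momentum and winding under this transformation can be illustrated with the zero mode spectrum of a string on a circle. The sketch below assumes the standard result, which is not derived in the text, that in units with alpha' = 1 a state with momentum number n and winding number w on a circle of radius R contributes (n/R)^2 + (wR)^2 to the mass squared. The function names are illustrative only.

```python
from fractions import Fraction

def zero_mode_mass_sq(n, w, radius):
    """Momentum plus winding contribution to M^2 (alpha' = 1, oscillators ignored)."""
    return (Fraction(n) / radius) ** 2 + (Fraction(w) * radius) ** 2

def t_dual(n, w, radius):
    """T duality: swap momentum and winding numbers and invert the radius."""
    return w, n, 1 / radius

radius = Fraction(3, 2)
for n in range(-2, 3):
    for w in range(-2, 3):
        # Every state has a partner of equal mass in the dual description.
        assert zero_mode_mass_sq(n, w, radius) == zero_mode_mass_sq(*t_dual(n, w, radius))
```

In this toy spectrum the winding term grows like R^2, so states with nonzero winding become infinitely heavy as the circle decompactifies.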

We learn two things from this: first, there are only half as many differ- 
ent string theories as we thought, and second, to see that theories are the 
same we may have to compactify them. Decompactification loses impor- 
tant degrees of freedom (in this case string winding modes, which go off to 
infinite energy) which are necessary to see the equivalence. There are no 
more such equivalences which can be seen in perturbation theory, but we 
might begin to suspect that there are further equivalences which might only 
appear nonperturbatively. How can we hope to realize this possibility in a 
theory which is formulated only as a perturbation series? 

The key to answering this question is the notion of SUSY preserving or 
BPS states. To explain what these are, let me introduce the SUSY algebra 

$$\{\bar{Q}_a, Q_b\} = \gamma^\mu_{ab} P_\mu.$$ (2.7) 

Actually, this is only the simplest SUSY algebra one can have in a given 
dimension. We will see more complicated ones in a moment. If we look at 
particle representations of the SUSY algebra, then $P_\mu$ is either a timelike 
or a lightlike vector. In the timelike case, the matrix on the right hand side 
of (2.7) is nondegenerate, while in the lightlike case it is degenerate - half 
of the states in the representation are annihilated by it. This means that 
in the lightlike case half of the SUSY generators annihilate every state in 
the representation. Thus massless supermultiplets are smaller than massive 
ones. This means that, in general, in a supersymmetric theory small changes 
in the parameters will not give mass to massless particles. In order to do so 
one must have a number of massless multiplets which fit together to form a 
larger massive multiplet (the super Higgs mechanism) in order for states to 
be lifted. If this is not the case for some values of the parameters, then small 
changes cannot make it so and the massless particles remain. Of course, we 
did not really need SUSY to come to this conclusion for massless particles 
of high enough spin. In that case it is already true that the Lorentz group 
representations of massless and massive particles are of different multiplicity. 

The new feature really comes if we compactify the theory in a way which 
preserves all the SUSY generators. This can be done by compactifying on 
a torus with appropriate boundary conditions. The SUSY algebra remains 
the same, but now some of the components of the momentum are discrete. 
Also, the Lorentz group is broken to the Lorentz group of the noncompact 
dimensions, so the spinor representation breaks up into some number of 
copies of the lower dimensional spinor. The algebra now looks like 

$$\{\bar{Q}^i_a, Q^j_b\} = \delta^{ij} \gamma^\mu_{ab} P_\mu + \delta_{ab} Z^{ij}.$$ (2.8) 

Spinor and vector indices now run over their lower dimensional values, and 
$i,j$ label the different copies of the lower dimensional spinor in the higher 
dimensional one. The generators $Z^{ij}$ are scalars under the lower dimension 
Lorentz group. They are combinations of the toroidal momenta and are 
examples of what are called central charges. 

Now consider a state carrying nonzero values of the central charge in such 
a way that the higher dimensional momentum is lightlike. It represents a 
massive Kaluza-Klein mode of the massless particle in the higher dimension. 
In a nonsupersymmetric theory the masses of Kaluza Klein modes of higher 
dimensional massless fields are renormalized by quantum corrections. But 
in a theory with the extended SUSY algebra (2.8) we can ask whether the 
representation is annihilated by half the supercharges (other fractions are 
possible as well). If it is, we get a computation of the particle’s nonzero mass 
in terms of its central charges. This mass cannot change as parameters of 
the theory are varied in such a way as to preserve extended SUSY, or rather 
the formula for its variation with parameters may be read directly from the 
SUSY algebra. We will see below that it is possible to realize the central 
charges $Z^{ij}$ of the extended lower dimensional SUSY algebra in other ways. 
Instead of representing KK momenta in a toroidal compactification, they 
might arise as winding numbers of extended objects, called branes, around 
the compact manifold. The italicized conclusion will be valid for these states 
as well. 

The argument for the statement in italics above, is again based on the 
smaller dimension of the representation. To see it more explicitly, work in 
the frame where the spatial momentum is zero, and take the expectation 
value of the anticommutator in states of a single particle with mass M 
(actually we mean a whole SUSY multiplet of particles). Then (2.8) reads 

$$M \delta^{ij} + \gamma^0 Z^{ij} = \langle \{Q^{i\dagger}_a, Q^j_b\} \rangle \geq 0.$$ (2.9) 

The last inequality follows because we are taking the expectation value of 
a positive operator. It says that the mass is bounded from below by the 
square root of the sum of the squares of the eigenvalues of the matrix Z, 
which are also called the central charges. 

Equality is achieved only when the expectation value vanishes, which, 
since the SUSY charges are Hermitian, means that some of the charges anni- 
hilate every state in the representation. These special representations of the 
algebra have smaller dimension and cannot change into a generic represen- 
tation, which satisfies the strict inequality, as parameters are continuously 
varied. 
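
The positivity argument can be made concrete with a toy example. The sketch below is an assumption made for illustration, not the algebra of the text: it takes a single pair of supercharges whose rest frame anticommutator matrix is M times the identity plus an off diagonal complex central charge Z, and checks that positivity forces M >= |Z|, with one eigenvalue vanishing exactly when the bound is saturated.

```python
import math

def hermitian_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [conj(b), d]] for real a, d and complex b."""
    mean = (a + d) / 2
    split = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return mean - split, mean + split

def anticommutator_eigenvalues(mass, central_charge):
    """Toy rest frame matrix M*1 + [[0, Z], [conj(Z), 0]] (hypothetical example)."""
    return hermitian_2x2_eigenvalues(mass, central_charge, mass)

def is_allowed(mass, central_charge, tol=1e-12):
    """The expectation value of a positive operator cannot have a negative eigenvalue."""
    return min(anticommutator_eigenvalues(mass, central_charge)) >= -tol

z = 3 + 4j                                   # toy central charge with |Z| = 5
print(is_allowed(6.0, z))                    # generic massive multiplet, M > |Z|
print(is_allowed(4.0, z))                    # M < |Z| violates positivity
print(anticommutator_eigenvalues(5.0, z))    # M = |Z|: one eigenvalue vanishes
```

The zero eigenvalue at M = |Z| corresponds to the combination of supercharges that annihilates the whole multiplet, which is why the saturating representation is smaller.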

Thus, in theories with extended SUSY, certain masses can be calculated 
exactly from the SUSY algebra. These special states are called Bogomol'nyi 
Prasad Sommerfield or BPS states, since these authors first encountered 
this phenomenon in their classical studies of solitons [10]. The connection 
to SUSY, which makes the classical calculations into exact quantum statements, 
was noticed by Olive and Witten [12]. 

Fig. 1. A two torus with nontrivial cycles labelled. 



Notice that although we motivated this argument in terms of Kaluza- 
Klein states, it depends mathematically only on the structure of the ex- 
tended SUSY algebra. Thus if we can obtain this algebra in another way, 
we will still have BPS states. An alternative origin for central charges and 
BPS states comes from “wrapped branes” of a higher dimensional theory. 

To understand this notion, note that, strictly from the point of view of 
Lorentz invariance, the SUSY algebra could contain terms like 



$$\{\bar{Q}_a, Q_b\} = \gamma^{\mu_1 \ldots \mu_p}_{ab} Z_{\mu_1 \ldots \mu_p}.$$ (2.10) 

The multiple indices are antisymmetrized. The famous Haag et al. [11] 
generalization of the Coleman Mandula theorem tells us that this pth rank 
antisymmetric tensor charge must vanish on all finite energy particle states. 
On the other hand, the purely spatial components of it have precisely the 
right Lorentz properties to count the number of infinite energy p-branes, or 
p-dimensional domain walls, oriented in a given hyperplane. We will have 
more to say about these brane charges when we talk about branes and gauge 
theories below. 

Now suppose we have compactified p or more dimensions, and the result- 
ing compact space has a topologically nontrivial p-dimensional submanifold, 
or p-cycle. To see what we mean, consider the two torus. 

It has two different kinds of nontrivial 1 cycle, labelled a and b in the 
figure. The whole torus is a nontrivial two cycle. The word nontrivial cycle 
or just cycle implies that the submanifold cannot be contracted to a point 
“because it wraps around a hole in the manifold”. If we wrap a p-brane 
around the p-cycle, we get a finite energy particle state. The tensor charge 
with all indices pointing in the compact directions is a scalar charge in 
the remaining noncompact directions and is allowed to appear as a central 
charge in an extended SUSY algebra. Often, the corresponding particles 
have the BPS property. 

With this background, we can get on with the story of M. Practitioners 
of string duality realized that BPS states gave them a powerful handle on 
nonperturbative physics. For example, consider a weakly coupled string 
theory and a solitonic state, whose mass in string units is proportional to 
$g^{-r}$.⁷ If it is a BPS state then as the coupling becomes infinitely strong, it 
becomes infinitely lighter than the string scale (if not for the BPS property, 
we could not trust the weak coupling formula at strong coupling). In all the 
cases which have been studied one can, by thinking about the lightest BPS 
states in the strong coupling limit, realize that they are just the elementary 
states of another weakly coupled theory. In most cases, this is another string 
theory, but there is a famous exception. 

If one considers Type IIa string theory in ten dimensions, it contains a 
single $U(1)$ gauge field. None of the perturbative states are charged under 
this field; they have only magnetic moment couplings. If one considers 
hypothetical BPS states charged under this $U(1)$, then it is easy to show 
that their spectrum in the strong coupling limit is precisely that of the 
supergravity multiplet in 11 dimensions. Thus one is led to conjecture [35] 
that the strong coupling limit of Type IIa string theory has eleven flat 
dimensions and a low energy limit described by SUGRA. 

None of this was much of a surprise to the SUGRAistas [27]. It had 
long been known that the low energy limit of IIa string theory was a ten 
dimensional SUGRA theory which was the dimensional reduction of 11D 
SUGRA, with the string coupling appearing as the three halves power of 
the ratio of the radius of the reducing circle to the eleven dimensional Planck 
length. The SUGRAistas even had a correct explanation of where the strings 
come from. As we will see 11D SUGRA couples naturally to a membrane, 
the M2 brane. If we wrap one leg of the M2 brane around the circle whose 
radius is being shrunk to zero we get a string whose tension is going to zero 
in Planck units. 
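
The scalings in this paragraph can be checked against each other numerically. The sketch below is only a rough illustration under stated assumptions: factors of 2 pi are dropped, the membrane tension is taken to be one in eleven dimensional Planck units, and the mass of a BPS particle with r = 1 is taken to be 1/(g l_s) in string units. The function name `iia_parameters` is hypothetical.

```python
def iia_parameters(radius, planck_length=1.0):
    """Type IIa data from an 11D circle of the given radius (2*pi factors dropped)."""
    # The coupling is the three halves power of the radius in 11D Planck units.
    string_coupling = (radius / planck_length) ** 1.5
    # Wrapped M2 brane: string tension ~ membrane tension * circumference ~ R / l_P^3.
    string_tension = radius / planck_length ** 3
    string_length = string_tension ** -0.5
    return string_coupling, string_tension, string_length

for radius in (0.01, 1.0, 100.0):
    g, tension, l_s = iia_parameters(radius)
    kk_mass = 1.0 / radius        # lightest Kaluza-Klein graviton on the circle
    bps_mass = 1.0 / (g * l_s)    # mass ~ 1/g in string units (assumed r = 1 scaling)
    print(f"R={radius:g}  g={g:g}  T_string={tension:g}  m_KK={kk_mass:g}  m_BPS={bps_mass:g}")
```

With these assumptions the two masses agree for every radius, and the string tension vanishes in Planck units as the circle shrinks.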

String theorists get a C for closed mindedness for ignoring the message 
of the SUGRAistas for so long. Behind their resistance lay the feeling that 
because both 11D SUGRA and the world volume theory of membranes 
are nonrenormalizable, one could not trust conclusions drawn on the basis 
of these theories. It was only with the advent of an unambiguous, string 
theoretic construction of the KK gravitons of 11D SUGRA as bound states 
of D0 branes [28, 29] that the last bastions of resistance fell. What one 
should have realized from the beginning was that conclusions about BPS 
states, based as they are only on the symmetry structure of the theory, 
can be extrapolated from effective theories far beyond the limited range of 
validity of these low energy approximations. 

⁷In string theory both r = 1,2 are realized. r = 2 corresponds to a conventional 
soliton, arising as a solution of the classical equations of motion; r = 1 corresponds to 
Dirichlet brane or D brane states. 

The picture as we understand it today⁸ is illustrated by the famous 
“modular deerskin”. 

Fig. 2. A cartoon of the moduli space of M-theory. 

There is a single theory, which we now call M-theory⁹, which has a 
large moduli space. All of the known perturbative string theories, and 
11D SUGRA are limits of this theory in certain extreme regions, or boundaries, 
of moduli space (the cusps in the picture). There is another class of 
limits, called F-theory, which are not amenable to complete analysis, but 
about which many nontrivial statements can be made. An example of an 
F-theory region is the strongly coupled heterotic string compactified on a 
two torus. 

One of the lessons of duality is that no one of these regions is a priori 
better than any other. Each of them tells a partial story about M-theory, 
and we learn a lot by trying to patch these stories together. However, the 
11D SUGRA limit has a distinct advantage when one is trying to explain 
M-theory to non-string theorists, particularly if they have a good background 
in GR. In this limit, most of the arguments are completely geometrical and 
can be understood on the basis of classical field theory and the classical 
Lagrangians for various extended objects. For this reason, I will begin in 
the next section with a discussion of the 11D SUGRA Lagrangian¹⁰. 

⁸Or rather a cartoon of it, for moduli space is much more complicated than a two 
dimensional deerskin. 

⁹Or at least some of us do. Some people reserve the name M-theory for the region of 
moduli space where 11D SUGRA is a good approximation. I consider that a waste of a 
good name since we can call this the 11D SUGRA region. 

3 Eleven-dimensional supergravity 

In eleven dimensions, the graviton has 44 spin states transforming in the 
symmetric traceless tensor representation of the transverse (transverse to 
the graviton’s lightlike momentum) $SO(9)$ rotation group. The gravitino 
is a tensor-spinor of this group, satisfying the constraint $\gamma^i_{ab} \psi^i_b = 0$ which 
leaves 128 components. The remaining $84 = \frac{9 \cdot 8 \cdot 7}{3!}$ bosonic states in the 
SUGRA multiplet transform as a third rank antisymmetric tensor. 
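
The counting of states in the multiplet is easy to verify. The short sketch below assumes, as standard representation theory gives, that the real spinor of SO(9) has 16 components; the function name is illustrative.

```python
from math import comb

def sugra_11d_counts(transverse_dim=9, spinor_dim=16):
    """Physical state counts of the 11D SUGRA multiplet under the transverse SO(9)."""
    graviton = transverse_dim * (transverse_dim + 1) // 2 - 1   # symmetric traceless tensor
    three_form = comb(transverse_dim, 3)                         # rank 3 antisymmetric tensor
    gravitino = transverse_dim * spinor_dim - spinor_dim         # vector-spinor minus gamma trace
    return graviton, three_form, gravitino

graviton, three_form, gravitino = sugra_11d_counts()
assert graviton + three_form == gravitino   # bosons balance fermions
print(graviton, three_form, gravitino)      # prints: 44 84 128
```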

The covariant Lagrangian for the bosonic fields in the multiplet is 

(3.1) 

The supersymmetry transformation of the gravitino is 

(3.2) 

The existence of a three form gauge potential, suggests that the theory may 
contain a membrane, which couples to the three form via 

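$$Q_2 \int d^3\xi\, \epsilon^{abc}\, C_{\mu\nu\lambda}(x(\xi))\, \frac{\partial x^\mu}{\partial \xi^a} \frac{\partial x^\nu}{\partial \xi^b} \frac{\partial x^\lambda}{\partial \xi^c}$$ (3.3) 

where the $\xi^a$ are coordinates on the membrane world volume. The dual of 
the four form field strength is a seven form $G_7$, whose source, defined 
by $d * G_7 (= \pm dG_4) = * J_6$, is a six form current. This is a current of five 
dimensional objects, which we will call M5 branes. 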

The low energy SUGRA approximation to M-theory gives us evidence 
that both M2 and M5 branes exist, since there are soliton solutions of the 
SUGRA equations of motion with the requisite properties. This would not 
be a terribly convincing argument, since the scale of variation of the fields 
of these objects is (what else?) the eleven dimensional Planck scale, and 
SUGRA is only an effective field theory. However, these solitons have the 
BPS property. That is, we can find them by insisting that half of the 
gravitino SUSY variations vanish. This leads to first order equations, which 
are much easier to solve than the full second order equations, but give a 
subclass of solutions of the latter. Since these solutions are constructed so 
that half of the SUSY variations vanish, their Poisson brackets with half the 
SUSY generators (in a canonical formulation of 11D SUGRA) vanish. This 
is the classical approximation to the statement that the quantum states 
represented by these soliton solutions are annihilated by half of the SUSY 
generators. 

¹⁰At least its purely bosonic part. Fermions, like virtue in the world of politics, are 
entities often talked about, but rarely seen, in discussions of SUSY theories. 

¹¹The large number of spacetime dimensions leads me to resort to differential form 
notation even for an audience of cosmologists. 

The solutions are 



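$$ds^2 = \left(1 + \frac{\ldots}{r^6}\right)^{-2/3} (-dt^2 + d\sigma^2 + d\rho^2) + \left(1 + \frac{\ldots}{r^6}\right)^{1/3} dx_8^2, \qquad \ldots$$ 

$$ds^2 = \left(1 + \frac{\ldots}{r^3}\right)^{-1/3} (-dt^2 + dx_5^2) + \left(1 + \frac{\ldots}{r^3}\right)^{2/3} dy_5^2, \qquad \ldots$$ (3.4)-(3.8) 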

for the two brane and five brane respectively. 

In each of these equations, r denotes the transverse distance from the 
brane. These solutions contain arbitrary parameters $T_2$ and $T_5$ which control 
the strength of the coupling of these objects to the three form gauge 
potential and to gravity. However, an elementary argument leads to a determination 
of these parameters. Compactify the theory on a seven torus 
and wrap the M2 brane around two of the dimensions of this torus and the 
M5 brane around the other five. The integral of the three form over the 
two torus on which the membrane is wrapped gives us an ordinary Maxwell 
(1 form) potential. It is easy to see that the wrapped membrane is a charged 
particle with charge $Q_2$ with respect to this Maxwell field (the membrane 
coupling of (3.3) dimensionally reduces to the standard Maxwell coupling 
to a charged particle). It is a little harder to see that the wrapped M5 
brane is a magnetic monopole for this field. Thus, the Dirac quantization 
condition implies 

$$2\pi Q_2 Q_5 \in Z.$$ (3.9) 



This is a quick and dirty proof of the Nepomechie-Teitelboim [30] generalization 
of the Dirac quantization condition to p-form gauge fields. This condition 
determines the tension of the minimally charged M2 and M5 branes 
to be: 

$T_5 =$ (3.10) 

$T_2 =$ (3.11) 

Given the existence of these infinite flat branes, we can also study small fluctuations 
of them which describe (slightly) curved branes moving in spacetime. 
The most useful way to do this is to introduce a world volume field 
theory which contains the relevant fluctuations. Among the variables of 
such a theory should be a set of scalar fields which describe small fluctua- 
tions of the brane in directions transverse to itself. It turns out that these 
world volume theories are, in the present case, completely determined by 
SUSY. 

Let us begin with the M2 brane. The SUSYs that preserve the brane 
satisfy $\gamma^3 \ldots \gamma^{10} Q = Q$.¹² There are 16 solutions to this equation, which 
transform as 8 spinors under the $SO(2,1)$ Lorentz group of the brane world 
volume. Each two component world volume spinor transforms in the eight 
dimensional spinor representation of the transverse $SO(8)$ rotation group. 
From the point of view of the world volume field theory, the latter is an 
internal symmetry. We expect the world volume theory to contain 8 scalar 
fields, representing the transverse fluctuations of the membrane. A SUSY 
Lagrangian containing these fields is given by 

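$$\mathcal{L}_{M2} = \partial_a X^i \partial_a X^i + \bar{\theta} \Gamma^a \partial_a \theta,$$ (3.12) 

where $\Gamma^a$ are three world volume Dirac matrices. The SUSY generators are: 

$$Q = \int [\ldots + \gamma^i \ldots \partial_a X^i].$$ (3.13) 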

Using the canonical commutation relations for the world volume fields it 
is straightforward to verify that these satisfy the SUSY algebra. Here we 
have used $\gamma^i$ to represent the eight dimensional Dirac matrices, despite 
the possibility of confusion with the eleven dimensional matrices of the 
paragraphs above and below¹³. 

The world volume theory of the M5 brane is more interesting. The 
SUSYs preserved satisfy $\gamma^6 \ldots \gamma^{10} Q = Q$ (using the same argument as in the 
footnote above). The world volume Lorentz group $SO(5,1)$ has two different 
chiralities of spinor representation, and (ultimately because the product of 
all eleven dimensional Dirac matrices is 1) this condition says that all the 
SUSY generators have the same chirality. There are sixteen real solutions of 
these constraints which can be arranged as two complex 4 representations 
of the world volume Lorentz group. Under the transverse $SO(5)$ rotation 
group, they transform as four copies of the fundamental pseudoreal spinor. 
This kind of SUSY is called (2, 0) SUSY in six dimensions. 

¹²To see this, note that the condition specifying which charges annihilate the membrane 
state must be linear (the sum of two such charges is another) and invariant under both 
the transverse rotation and world volume Lorentz groups. This is the only such condition 
since the product of all the 11D Dirac matrices is 1. 

¹³We have also passed over in silence the two different types of eight dimensional spinor 
which appear in these equations. Experts will understand and amateurs would only be 
confused by this detail. 

The coordinates of transverse fluctuations are five scalar world volume 
fields, which transform as the vector of $SO(5)$. There is a unique SUSY 
representation which includes these fields. Their superpartners are two 
fermions in the 4 representation of the Lorentz group, and a second rank 
antisymmetric tensor gauge field $B$, whose field strength satisfies the self 
duality condition $dB = * dB$. Indeed, in a physical light cone 
gauge, $B_{AB}$ (the $A, B$ indices indicate the four transverse dimensions in 
the lightcone frame inside the world volume) has 6 components and the self 
duality cuts this down to 3. Combined with the five scalars this makes eight 
bosonic degrees of freedom which balance the eight degrees of freedom of 
a Weyl fermion. The self dual antisymmetric tensor field is chiral and its 
field equations cannot be derived from a covariant Lagrangian (without a 
lot of extra complications and gauge symmetries). As we will see, this is 
the origin of the world sheet chirality of the heterotic string. 
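
The balance of bosonic and fermionic degrees of freedom on the M5 brane can be tallied in a few lines. This is a counting sketch only; it assumes the usual rule that the first order equations of motion leave half of the 16 real fermion components as physical degrees of freedom.

```python
from math import comb

transverse_in_brane = 4                      # light cone directions inside the 5+1 world volume
two_form = comb(transverse_in_brane, 2)      # independent components of B_{AB}
self_dual = two_form // 2                    # self duality keeps half of them
scalars = 5                                  # transverse fluctuations, vector of SO(5)
bosons = self_dual + scalars

preserved_supercharges = 16                  # real SUSYs left unbroken by the brane
fermions = preserved_supercharges // 2       # assumed: equations of motion halve the components

assert (two_form, self_dual) == (6, 3)
assert bosons == fermions == 8
```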

4 Forms, branes and BPS states 

4.1 Differential forms and topologically nontrivial cycles 

Before proceeding with our discussion of compactification of 11D SUGRA, 
and its relation to string theory, I want to insert a short remedial course 
on the mathematics of differential forms. We have already used this above, 
and will use it extensively in the sequel. Differential n forms or totally 
antisymmetric covariant tensor fields, were invented by mathematicians as 
objects which can be integrated over n dimensional submanifolds of a man- 
ifold of dimension d. The basic idea is that at each point, such a form picks 
out n linearly independent tangent vectors to the manifold and assigns a 
volume to the corresponding region on the submanifold. If $t_i^\mu$ are the tangent 
vectors, then $\omega_{\mu_1 \ldots \mu_n} t_1^{\mu_1} \ldots t_n^{\mu_n}$ is the volume element. 

Mathematicians introduced Grassmann variables $dx^\mu$ as placeholders for 
the $n$ independent tangent vectors. Thus, an $n$ form becomes 
$\omega_{\mu_1 \ldots \mu_n} dx^{\mu_1} \ldots dx^{\mu_n}$, which is a commuting or anticommuting element of the Grassmann 
algebra according to whether $n$ is even or odd. In this way, the set of all 
forms of rank $0 \to d$ is turned into an algebra. A derivative operator $d$ is 
defined on this algebra by $d = dx^\mu \frac{\partial}{\partial x^\mu}$. This definition is independent of 
the metric or affine connection on the manifold. Note that $d^2 = 0$. Forms 
satisfying $d\omega = 0$ are called closed. Trivial solutions of the form $\omega = dA$ 
are called exact, and the set of equivalence classes of non exact closed forms 
(modulo addition of an exact form) is called the cohomology of the manifold. 

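If a submanifold is parametrized as a mapping $x^\mu(\xi)$ from some $n$ dimensional 
parameter space into the manifold, then the integral of a form 
over the submanifold is given by 

$$\int \omega = \int d^n\xi\, \epsilon^{a_1 \ldots a_n}\, \omega_{\mu_1 \ldots \mu_n}(x(\xi))\, \frac{\partial x^{\mu_1}}{\partial \xi^{a_1}} \cdots \frac{\partial x^{\mu_n}}{\partial \xi^{a_n}}.$$ (4.1) 

If $\omega$ is an $n - 1$ form and $S$ an $n$ dimensional submanifold with boundary $\partial S$, 
then the generalized Stokes theorem says that $\int_S d\omega = \int_{\partial S} \omega$. In particular, 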
the integral of an exact form over a submanifold without boundary, vanishes. 

Most n dimensional submanifolds without boundary, are themselves 
boundaries of $n + 1$ dimensional submanifolds. However, in topologically 
nontrivial situations there can be exceptions, called nontrivial n-cycles. You 
can see this in the example of the a or b cycle in Figure 1. Generally there 
are many such nontrivial cycles if there are any, but they often differ by 
trivial cycles (think of two different circles which go around the circumfer- 
ence of the torus). Again, the n cycle is considered to be the equivalence 
class of nontrivial submanifolds modulo trivial ones. 

One of the most important theorems in mathematics is the de Rham 
theorem, which states that there is a one to one correspondence between 
the cohomology of a manifold and the independent nontrivial closed cycles. 
That is, one can choose a basis $\omega_i$ in the space of closed modulo exact $n$ 
forms such that $\int_{C_j} \omega_i = \delta_{ij}$. Here $C_j$ are the independent nontrivial $n$ 
cycles. 
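
A simple numerical illustration of a closed but not exact form may help here. The example below is an assumption chosen for simplicity and does not come from the text: instead of the torus it uses the plane with the origin removed, where the one form (x dy - y dx)/(x^2 + y^2) is closed. Its integral over a circle around the hole is 2 pi, while its integral over a contractible circle vanishes.

```python
import math

def period(center, radius, steps=20000):
    """Integrate (x dy - y dx) / (x^2 + y^2) around a circle, segment by segment."""
    cx, cy = center
    total = 0.0
    for k in range(steps):
        t0 = 2 * math.pi * k / steps
        t1 = 2 * math.pi * (k + 1) / steps
        tm = 0.5 * (t0 + t1)
        # Evaluate the form at the midpoint of each small arc.
        x, y = cx + radius * math.cos(tm), cy + radius * math.sin(tm)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += (x * dy - y * dx) / (x * x + y * y)
    return total

print(period((0.0, 0.0), 1.0) / (2 * math.pi))   # circle around the hole: close to 1
print(period((3.0, 0.0), 1.0) / (2 * math.pi))   # contractible circle: close to 0
```

Dividing the form by 2 pi gives a basis element with unit period on the nontrivial cycle, in the spirit of the normalization quoted above.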

So much for pure mathematics. The reason that all of this math is inter- 
esting in M-theory is that the theory contains dynamical extended objects 
called p-branes, and the theory of differential forms allows us to understand 
the most important low energy dynamical properties of these objects as a 
beautiful generalization of Maxwell’s electrodynamics. In addition this leads 
us to a new and deep mechanism for generating nonabelian gauge groups, 
which is connected to the theory of singularities of smooth manifolds. This 
in turn allows us to obtain an understanding of certain spacetime singulari- 
ties in terms of Wilson’s ideas about singularities of the free energy at second 
order phase transitions. It was long known that the free energy of statisti- 
cal systems at second order phase transitions had singularities as a function 
of the temperature and other thermodynamic variables. Wilson realized 
that these singularities could be understood using the equivalence between 
statistical mechanics and Euclidean quantum field theory. At values of the 
parameters corresponding to phase transitions, massless particles appear in 
the field theory and the singularities of the free energy are attributed to 
infrared divergences coming from integrating over the fluctuations of these 
particles.

In classical geometry, singularities of manifolds can be classified by asking
which nontrivial cycles shrink to zero as parameters are varied in such a
way that a smooth manifold becomes singular. In M-theory there are states 
described by BPS branes wrapped around these nontrivial cycles, which be- 
come massless when the cycles shrink to zero. The singularities in classical 
geometry are then understood to be a reflection of the quantum fluctuations 
of these massless particles. That is, singular quantities in classical geometry 
can be calculated in terms of Feynman diagrams with loops of the massless 
states that M-theory predicts at these special points in moduli space (only 
these states contribute to the infrared divergence). The quantum theory 
itself is nonsingular at these points, but its description in terms of classical
geometry breaks down because there are light degrees of freedom (the
wrapped branes) other than the gravitational field. Branes and singularities
are at the heart of string duality.

Let us begin the discussion of branes by recalling the Lagrangian for the
coupling of the electromagnetic field to a charged particle. It is

$$\int d\tau\, A_\mu(x(\tau))\, \frac{dx^\mu}{d\tau} = \int_{C_1} A_1, \qquad (4.2)$$

where in the second equality we have used the notation of forms. If the
particle path is closed, this action is invariant under gauge transformations
$A_1 \to A_1 + d\Lambda_0$. If we add to the action the simplest gauge invariant
functional of the $A_1$ field, $\int *dA_1 \wedge dA_1$, we obtain Maxwell’s theory of the
coupling of charged particles to electromagnetism.

There is an obvious generalization of all of this to the coupling of a $p + 1$
form potential to a p-brane. The interaction is given by

$$\int_{C_{p+1}} A_{p+1}, \qquad (4.3)$$

where the integral is over the world volume of the p-brane. By the generalized
Stokes theorem, this enjoys a gauge invariance $A_{p+1} \to A_{p+1} + d\Lambda_p$.
By virtue of the fundamental equation $d^2 = 0$, $F_{p+2} = dA_{p+1}$ is a gauge
invariant object and we can write an immediate generalization of Maxwell’s
action,

$$\int d^d x\; *F_{p+2} \wedge F_{p+2}. \qquad (4.4)$$

(Here the * denotes the Hodge dual, but for the purposes of this lecture we can just
think of this as a shorthand for Maxwell’s action.)

Once we have normalized A by writing the free Maxwell
Lagrangian, we are left with a free coefficient in the coupling of the brane to
the gauge field. In the electromagnetic case, we know that this coefficient,
the electric charge, is in fact quantized if we introduce magnetic monopoles
and quantum mechanics, an observation first made by Dirac.

The analogous observation for general p-branes was made by
Nepomechie and Teitelboim [30]. A p-brane couples to a rank $p + 2$ field
strength. In d spacetime dimensions we can introduce, given a metric, a
dual field strength

$$\tilde{F}_{\mu_1 \ldots \mu_{d-p-2}} = \epsilon_{\mu_1 \ldots \mu_d}\, F^{\mu_{d-p-1} \ldots \mu_d}, \qquad (4.5)$$

where we have raised indices with the metric tensor. One thus sees that the
natural dual object to a p-brane is a $d - p - 4$ brane.

The easiest way to see the Dirac-Nepomechie-Teitelboim condition is to
compactify the system on a torus of dimension $d - 4$. We wrap the p-brane
around p-cycles of this torus and its dual around the remaining $d - p - 4$. The
integral $\int_{T^p} A_{p+1}$ defines a one form or Maxwell field in the uncompactified
four dimensional spacetime, and the p-brane is an electrically charged
particle. It is easy to convince oneself that the wrapped dual brane is a
monopole. Thus we obtain a quantization condition relating the couplings
of the two dual branes to the $p + 1$ form gauge potential. Nepomechie and
Teitelboim show that there are no further consistency conditions.

4.2 SUSY algebras and BPS states 

Now let us recall what we learned in the previous section about BPS states. 
I will repeat that material briefly here, but readers who feel they have 
absorbed it adequately can skip the first few paragraphs. We pointed out 
above that many SUSY theories have classes of massive states whose masses 
are protected from renormalization in the same way that those of massless 
particles of spin greater than or equal to one half are. These are called 
Bogomol'nyi-Prasad-Sommerfield or BPS states. The easiest to understand
are the Kaluza Klein states of toroidal compactification, but there is a vast
generalization of this idea. The theorem of Haag, Łopuszański and Sohnius
(a generalization of the Coleman-Mandula theorem) [11] shows that the
ordinary SUSY algebra is the most general algebra compatible with an
S matrix for particle states. Purely algebraically though, the right hand
side of the SUSY algebra could have contained higher rank antisymmetric
tensor charges (the general representation appearing in the product of two
spinors).

Our discussion of branes and gauge fields provides us with a natural
source of such charges, as well as showing us the loophole in the HLS theorem.
Indeed, following our analogy with Maxwell electrodynamics, it is
easy to see that an infinite, flat, static p-brane carries a conserved rank
p antisymmetric tensor charge as a consequence of the equations of motion
of the $p + 1$ form gauge field it couples to. The fact that these branes are
infinite extended objects and carry infinite energy is the loophole in the
theorem. It referred only to finite energy particle states. All of the tensor
charges vanish on finite energy states.

However, when we compactify a theory, we can imagine wrapping one 
of these p-branes around a nontrivial p-cycle in the compact manifold. The 
resulting state propagates as a particle in the noncompact dimensions. It has 
finite energy, proportional to the volume of the cycle it was wrapped around. 
Its tensor charge becomes a scalar in the noncompact dimensions and is 
called a central charge of an extended (i.e. larger than the minimal algebra
in the noncompact dimensions) SUSY algebra. Thus the central charges in 
extended SUSY algebras in low dimensions may come from wrapped brane 
charges as well as KK momenta. 

Perhaps the most remarkable fact about this statement is that as the 
volume shrinks to zero, the mass of the wrapped BPS state does as well. If 
the volume of the relevant p-cycle parametrizes a continuous set of super- 
symmetric vacuum states, then this conclusion is exact and can be believed 
in all regimes of coupling even though it was derived by crude semiclassi- 
cal reasoning. Indeed, even if we don’t know the theory we are trying to 
construct, we can still believe in the existence of massless wrapped brane 
states as long as we posit that the SUSY algebra is a symmetry. We will 
make extensive use of this argument in the sequel. 

5 Branes and compactification 

5.1 A tale of two tori 

We are now in a position to study many of the important dualities of 
M-theory, at least at a cursory level. We will not have time to delve here 
into the many computations and cross checks which have convinced most 
string theorists that all of these dualities are exact. Many of the dual- 
ity statements remain conjectures supported by a lot of circumstantial evi- 
dence. Obviously, they cannot be proven until a full nonperturbative form of 
M-theory is discovered. However, an important subclass of the dualities 
can actually be proven in a Discrete Light Cone Quantization (DLCQ) of 
M-theory known as Matrix Theory [31]. It applies to compactifications
of M-theory with at least 16 unbroken SUSYs and at least 6 noncompact 
Minkowski spacetime dimensions. In DLCQ, one gives up Lorentz invari- 
ance by compactifying a lightlike direction on a circle. One then gets an 
exact description of M-theory in terms of an auxiliary quantum field theory 
living on a fictitious internal space. All of the duality symmetries rele- 
vant to this class of compactifications (the only ones we will talk about 
in these lectures) can be derived as properties of the auxiliary field theory. 
This includes statements (such as rotation invariance of the Type IIB theory
constructed by compactifying M-theory on a two torus) for which there was
no other evidence prior to the advent of Matrix Theory.

We begin by compactifying M-theory on a circle of radius $R_{10}$. When
$R_{10}$ is much larger than the eleven dimensional Planck length $L_P$ there is
a good description of the low energy physics of the system in terms of 11D
SUGRA compactified on a circle. The SUGRA Lagrangian incorporates all
of the low energy states of the system and gives a good approximation to
their low energy scattering amplitudes.

As $R_{10}$ drops below $L_P$, this description breaks down. An 11D low
energy physicist might guess that the low energy states of the system are
just the zero modes (on the circle) of the SUGRA fields. This gives 10D
Type IIA SUGRA, which has the following fields:

$$g_{\mu\nu}, \quad \phi, \quad B_{\mu\nu}, \quad C_{\mu\nu\lambda}, \quad A_\mu.$$

These can be identified as the ten dimensional metric (in string conformal
frame), the dilaton field which describes local variations of the radius of the
eleventh dimensional circle, a two form gauge potential which is given by
the integral of the 11D three form around the circle, the three form itself,
and a Kaluza-Klein one form gauge field. The effective Planck mass, $M_P^{10}$, of
this ten dimensional theory is given by

$$(M_P^{10})^8 \sim R_{10}\, (M_P^{11})^9. \qquad (5.1)$$

Thus, when $R_{10}$ is small, the effective 10D SUGRA description breaks down
at a much lower energy scale than the 11D Planck mass.

In fact, the existence of BPS M2 brane states tells us that there is an
even lower energy scale in the problem. Consider a configuration of an M2
brane “wrapped on the circle”:

$$X^\mu = x^\mu(t, \sigma), \qquad \mu = 0 \ldots 9, \qquad (5.2)$$

$$X^{10} = R_{10}\, \theta. \qquad (5.3)$$

The main part of the action of an M2 brane is the volume of the world surface
swept out by the brane, multiplied by the brane tension, which is of order
$L_P^{-3}$. For wrapped configurations, this reduces to the ten dimensional area
swept out by the string $x^\mu(t, \sigma)$ in units of the string tension $L_S^{-2} \sim R_{10} L_P^{-3}$.
This gives an energy scale for string oscillations $m_S \sim \sqrt{R_{10} (M_P^{11})^3}$ which is
much smaller than the ten dimensional Planck mass.

Thus, we are led to expect that M-theory on a small circle is dominated
at energies below the eleven dimensional Planck scale by low tension string
states. At the energy scale set by the string length gravitational couplings
are weak. This can be seen by rewriting the dimensionally reduced action
in terms of the string length. The coefficient of the Einstein action
becomes $(L_P/R_{10})^3 L_S^{-8}$, indicating that at the energy scale defined by the
string tension, gravitational couplings are determined by a small dimensionless
parameter, $g_S^2 = (R_{10}/L_P)^3$. In fact, using the technology
of Matrix Theory [32] one can show that in the small $g_S$ limit, M-theory
becomes a theory of free strings.
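
The hierarchy of scales described above can be made concrete with a few lines of arithmetic. This is a rough sketch only: all order one factors are dropped, as in the text, and the function name `circle_scales` and the choice of units ($L_P = 1$ by default) are assumptions made for illustration.

```python
import math

def circle_scales(r10, l_p=1.0):
    """Order of magnitude scales for M-theory on a circle of radius r10.

    Uses (M10)^8 ~ R10 (M11)^9, m_s ~ sqrt(R10 (M11)^3) and
    g_s^2 = (R10 / L_P)^3, with all order one factors dropped.
    """
    m11 = 1.0 / l_p                          # eleven dimensional Planck mass
    m10 = (r10 * m11 ** 9) ** (1.0 / 8.0)    # ten dimensional Planck mass, eq. (5.1)
    m_s = math.sqrt(r10 * m11 ** 3)          # string scale from the wrapped M2 brane
    g_s = (r10 / l_p) ** 1.5                 # string coupling
    return {"M11": m11, "M10": m10, "m_s": m_s, "g_s": g_s}

if __name__ == "__main__":
    for r10 in (1e-1, 1e-2, 1e-4):
        s = circle_scales(r10)
        print(f"R10/L_P = {r10:g}: m_s = {s['m_s']:.3g}, "
              f"M10 = {s['M10']:.3g}, g_s = {s['g_s']:.3g}")
```

For small $R_{10}/L_P$ the output shows $m_S$ well below $M_P^{10}$, which is itself below the 11D Planck mass, and the combination $(M_P^{10})^8 / m_S^8$ equals $1/g_S^2$, which is the coefficient of the Einstein action in string units.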

There is in fact a unique consistent ten dimensional theory of free strings
with the supersymmetry algebra of 11D SUGRA compactified on a circle
(the so called IIA algebra). It is the Type IIA superstring. In fact, one
can directly derive the full Green-Schwarz action for the superstring by
considering the supermembrane action of [33] restricted to the wrapped M2
brane configurations above. However, this derivation is entirely classical,
while the existence of the string and its action actually follow purely from
SUSY and are therefore exact quantum mechanical results.

This, the first of many dualities, exhibits the general strategy of the 
duality program. Starting from a limiting version of M-theory valid only 
in a certain domain of moduli space and/or energy scale we exhibit some 
heavy BPS state whose mass can be extrapolated into regimes where the 
original version of the theory breaks down, and goes to zero there. We then 
find that the effective theory of these new light states is another version of 
the theory. For the most part, we find only weakly coupled string limits and 
limits where 11D SUGRA is a valid approximation. This is exactly true if
we restrict attention to vacuum states with three or more noncompact space 
dimensions and 32 supercharges. With less SUSY there are limiting regimes 
(many of which are called by the generic name F-theory) where we do not 
have a systematic expansion parameter, though many exact results can be 
derived. 

If we try to repeat this exercise on a two torus something really interest- 
ing happens. The new regime corresponds to taking the area of the torus 
to zero with the rest of its geometry fixed. As is well known, up to an 
overall scale, the geometry of a two torus is determined by a parallelogram 
in the complex plane with one side going from zero to one along the real 
axis. This parallelogram describes the periodic boundary conditions which 
define the torus. It is completely fixed by its other side, which is a complex
number $\tau$ in the upper half plane. $\tau$ is called the complex structure of the
torus. In fact, different $\tau$s can describe the same torus. The $SL(2, Z)$ group
generated by $\tau \to \tau + 1$ and $\tau \to -1/\tau$ maps all complex numbers which
define the same torus onto each other.
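
To see how the two generators identify different values of the complex structure, one can repeatedly apply them until the parameter lands in the standard fundamental domain $|\mathrm{Re}\,\tau| \le 1/2$, $|\tau| \ge 1$. The sketch below is the textbook reduction procedure, not something taken from the lectures; the function name `reduce_tau` and the numerical tolerance are illustrative assumptions.

```python
def reduce_tau(tau, max_steps=1000, tol=1e-12):
    """Map tau (Im tau > 0) into |Re tau| <= 1/2, |tau| >= 1.

    Only the generators T: tau -> tau + 1 and S: tau -> -1/tau are used.
    Returns the reduced tau and integers (a, b, c, d) with ad - bc = 1
    such that reduced = (a*tau + b) / (c*tau + d).
    """
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    a, b, c, d = 1, 0, 0, 1
    for _ in range(max_steps):
        n = round(tau.real)              # apply T^(-n) to centre the real part
        tau = tau - n
        a, b = a - n * c, b - n * d
        if abs(tau) >= 1 - tol:
            return tau, (a, b, c, d)
        tau = -1 / tau                   # apply S
        a, b, c, d = -c, -d, a, b
    raise RuntimeError("reduction did not terminate")

if __name__ == "__main__":
    tau0 = complex(3.7, 0.02)
    reduced, matrix = reduce_tau(tau0)
    print(reduced, matrix)
```

Two values of the complex structure that are related by the group, for instance `tau0`, `tau0 + 1` and `-1 / tau0`, reduce to the same point, which is the sense in which they describe the same torus.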

In the zero area limit, we can define a whole set of low tension strings, 
by choosing a closed path of nontrivial topology on the torus, and studying 
M2 branes wrapped on this path. The inequivalent nontrivial paths on 
the torus are characterized by two fundamental cycles, called a and b in 
Figure 1. A general path consists of going p times around a and q times
around b. It turns out that the (p, q) strings with relatively prime integers
are stable and can be viewed as bound states of the (1, 0) and (0, 1) strings.
When the integers are not relatively prime the state is not bound. This
picture is derived from the BPS formula [34] for the string tension, which
follows from a classical calculation in 11D SUGRA and is promoted to an
exact theorem by invoking SUSY. The proof that the states with integers
having a common divisor are not bound is more complicated [29].
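
The geometry behind the (p, q) strings can be illustrated with a short calculation. This is a sketch under stated assumptions: the a cycle is taken to be the side "1" of the parallelogram and the b cycle the side given by the complex structure, the tension is estimated as the length of the straight (p, q) path divided by $L_P^3$ with order one factors dropped, and all function names are hypothetical.

```python
import math

def cycle_length(p, q, tau, area):
    """Length of the straight (p, q) cycle on a flat torus of the given area.

    Assumption: the a cycle is the side 1 and the b cycle the side tau of
    the defining parallelogram, rescaled so that the area comes out right.
    """
    side = math.sqrt(area / tau.imag)     # physical length of the a cycle
    return side * abs(p + q * tau)

def pq_string_tension(p, q, tau, area, l_p=1.0):
    """Tension of an M2 brane wrapped on the (p, q) cycle, up to order one factors."""
    return cycle_length(p, q, tau, area) / l_p ** 3

def is_bound_state(p, q):
    """(p, q) strings are stable bound states only for relatively prime p and q."""
    return math.gcd(p, q) == 1

if __name__ == "__main__":
    tau = complex(0.0, 50.0)              # one cycle much shorter than the other
    for p, q in [(1, 0), (0, 1), (3, 2), (2, 4)]:
        print((p, q), pq_string_tension(p, q, tau, area=1e-4), is_bound_state(p, q))
```

With a large imaginary part of the complex structure the (1, 0) string is far lighter than the (0, 1) string, and a (2, 4) configuration has exactly twice the tension of a (1, 2) string, consistent with it not being bound.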

Something even more interesting occurs when we consider M2 branes
which wrap the whole torus. A state with an m times wrapped brane has
energy

$$E \sim m A L_P^{-3} = \frac{m}{R_B} \qquad (5.4)$$

in the limit that the area goes to zero. It turns out [29] that the m times
wrapped states are stable against the energetically allowed decay into m
singly wrapped states. So in the limit where the area goes to zero we get a new
continuum. Any other state in the theory can bind with these wrapped
membranes at little cost in energy. The result is that the states are labelled
by a new continuous quantum number in addition to their momenta
in the eight noncompact dimensions. Even more remarkable (the only extant
proof of this requires Matrix Theory [6]) is that the new continuum
is related to the old one by an $SO(9, 1)$ Lorentz symmetry [6]. Thus, in
M-theory $11 - 2 = 10$.

Since the origin of the new Lorentz group is obscure, we have to resort
to Matrix Theory again to find out which kind of ten dimensional SUSY the
theory has. It turns out that both of the ten dimensional Weyl spinors
have the same chirality, and we are in the IIB theory.

(Actually, a consideration of the field content of the low energy theory is enough.
In particular the fact that variations of the complex structure $\tau$ over the noncompact
dimensions should appear as a complex scalar field is enough to tell us that we are in the
IIB theory. The ten dimensional Type IIA theory has only a single real massless scalar.
Matrix Theory is only necessary to prove that the statement is consistent at all energies.)

There is of course a weakly coupled string theory with this SUSY algebra:
the Type IIB Green-Schwarz superstring. In fact, the SUGRA limit of
this theory has an $SL(2, R)$ symmetry which one can argue is broken to
$SL(2, Z)$ by instanton effects. It acts in the expected way on $\tau$. Furthermore
there are actually two different two form gauge potentials, which form an
$SL(2, Z)$ doublet. Thus we expect to find two different kinds of strings, the
F(undamental) string and the D(irichlet) string. The latter is a soliton,
whose tension goes to infinity in the weak coupling limit. Consulting the
eleven dimensional picture we realize that the weak coupling limit should
be identified with the $\mathrm{Im}\,\tau \to \infty$ limit in which one of the cycles of the torus
is much smaller than the other. The F(D) string is then identified with the
M2 brane wrapped around the shorter (longer) cycle.

This trick of dimensional reduction by $2 - 1$ dimensions is interesting because
it gets around old theorems which stated that Kaluza Klein reduction
cannot produce chirality. It can be generalized in the following interesting
way. Certain higher dimensional manifolds can be viewed as “elliptic
fibrations”. That is, they consist of an m dimensional base manifold with
coordinates $z$ and a family of two tori $\tau(z)$ (the area also varies with z),
making altogether an $m + 2$ dimensional manifold. Now one varies parameters
in such a way that the areas of the two tori all shrink to zero. Naively
this would give a dimensional reduction by two dimensions. However, given 
enough SUSY one can again verify that an extra noncompact dimension 
appears in the limit so that the result is dimensional reduction by one. The 
name given to this general procedure is F-theory [13]. It is very useful for 
describing strong coupling limits of the heterotic string. 

If we try to pull the shrinking torus trick in 3 dimensions we run into a 
disappointment. The new low tension state which appears is a membrane 
obtained by wrapping the M5 brane around the torus. The effective low 
energy theory is then M-theory again with a new Planck length defined in 
terms of the light membrane tension. Indeed it can be shown [5, 35] that 
for three or more noncompact dimensions the only limiting theories one can 
obtain without breaking any SUSY are the Type II string theories and 11D
SUGRA. We will actually prove this theorem below in our discussion of
extreme limits of the moduli space. 

For my last example of a duality I will study the moduli space of 
M-theory compactifications which break half of the 11D SUSY. This is
achieved by compactifying on four dimensional spaces called K3 manifolds. 
We will have to understand a little bit about the geometry of such manifolds,
but I promise to keep it simple. The equation for the SUSY variation
of the gravitino is

$$\delta\psi_\mu = D_\mu \epsilon. \qquad (5.5)$$

This must vanish for certain values of the SUSY parameters $\epsilon$ in order to
leave some SUSY unbroken. A consistency condition for this vanishing is

$$R^{ab}_{\mu\nu}\, \sigma_{ab}\, \epsilon = 0, \qquad (5.6)$$

where $R^{ab}_{\mu\nu}$ is the curvature tensor in an orthonormal frame and $\sigma_{ab}$ are the
spin matrices in the Dirac spinor representation. We will always be dealing
with strictly Euclidean n dimensional manifolds so these are generators of
$O(n)$.

In two and three dimensions, the spinor has only two components and 
the generators are the Pauli matrices. The only solution of (5.6) is to set the 
curvature equal to zero, but then we do not break any SUSY. We can do better
with four compact dimensions. The group $SO(4) = SU(2) \times SU(2)$ has
two different two dimensional spinors (familiar to particle physicists after an
analytic continuation as left and right handed Weyl spinors), transforming
as (1,2) and (2,1) under the two $SU(2)$ subgroups. Thus, if the curvature
lies in one of these two subgroups and we choose $\epsilon$ to be a singlet of that
subgroup, then the consistency condition is satisfied.
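
The statement that a spinor of definite chirality is a singlet under one of the two $SU(2)$ factors can be verified with explicit matrices. The sketch below uses one particular (assumed) choice of Euclidean gamma matrices built from Pauli matrices, and the helper names are illustrative; which chirality is called "plus" is a matter of convention.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(c1, a, c2, b):
    """Linear combination c1*a + c2*b of two matrices."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron(a, b):
    """Kronecker product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

# Pauli matrices and one (assumed) choice of Euclidean gamma matrices in 4D.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
ID2 = [[1, 0], [0, 1]]
ID4 = kron(ID2, ID2)
ZERO4 = [[0] * 4 for _ in range(4)]
GAMMA = [kron(S1, S1), kron(S1, S2), kron(S1, S3), kron(S2, ID2)]

# gamma_5 is the product of the four gamma matrices; P_PLUS and P_MINUS
# project onto the two chiralities.
G5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
P_PLUS = lin(0.5, ID4, 0.5, G5)
P_MINUS = lin(0.5, ID4, -0.5, G5)

def epsilon(idx):
    """Totally antisymmetric symbol with epsilon((0, 1, 2, 3)) = +1."""
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

def sigma(a, b):
    """Spin matrix (1/4) [gamma_a, gamma_b], a generator of SO(4)."""
    return lin(0.25, matmul(GAMMA[a], GAMMA[b]), -0.25, matmul(GAMMA[b], GAMMA[a]))

def dual_sigma(a, b):
    """(1/2) epsilon_abcd sigma_cd."""
    out = ZERO4
    for c in range(4):
        for d in range(4):
            e = epsilon((a, b, c, d))
            if e:
                out = lin(1, out, 0.5 * e, sigma(c, d))
    return out

def residuals():
    """Largest entry of (self dual generator) * P_PLUS and of
    (anti self dual generator) * P_MINUS over all index pairs."""
    worst_sd = worst_asd = 0.0
    for a in range(4):
        for b in range(a + 1, 4):
            sd = lin(1, sigma(a, b), 1, dual_sigma(a, b))
            asd = lin(1, sigma(a, b), -1, dual_sigma(a, b))
            worst_sd = max(worst_sd, max_abs(matmul(sd, P_PLUS)))
            worst_asd = max(worst_asd, max_abs(matmul(asd, P_MINUS)))
    return worst_sd, worst_asd

if __name__ == "__main__":
    print(residuals())
```

Both residuals vanish: the self dual combinations of generators annihilate spinors of one chirality and the anti self dual combinations annihilate the other. A curvature lying entirely in one $SU(2)$ factor therefore gives zero when contracted with the spin matrices and applied to a spinor of the appropriate chirality.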

The stated condition on the curvature tensor is

$$R^{ab}_{\mu\nu} = \pm\tfrac{1}{2}\, \epsilon^{abcd} R^{cd}_{\mu\nu}. \qquad (5.7)$$

It is easy to see, using one of the standard identities for the Riemann tensor,
that this implies that the Ricci tensor, $R_{\mu\nu}$, vanishes. Thus,
insisting that half the SUSY is preserved implies that the manifold satisfies
the vacuum Einstein equations (Euclidean) or is, as we say, Ricci flat.

There is one more immediate consequence of the SUSY equations which
I want to note. Just like the spinor representation, the second rank antisymmetric
tensor representation of $SO(4)$ breaks up into a direct sum
of (1,3) and (3,1). Thus, there will be, on a manifold which preserves half
the SUSY, three independent covariantly constant (and therefore closed and
nowhere vanishing) two forms. Modulo some technical questions which
we will ignore, this implies that the manifold is hyperkähler. Compact, four
dimensional hyperkähler manifolds are called K3 manifolds (this is part of
an elaborate joke having to do with the famous Himalayan peak K2).

No one has ever seen a K3 metric, but mathematicians are adept at dealing
with objects they can’t write down explicitly. We will only need a tiny
bit of the vast mathematical literature on these spaces. In particular, I want
to remind you of the famous de Rham theorem, which relates topologically
nontrivial submanifolds in a space to the cohomology of differential forms.
Remember that a differential form is just a totally antisymmetric tensor
multiplied by Grassmann variables that mathematicians call differentials

$$\omega = \omega_{\mu_1 \ldots \mu_p}\, dx^{\mu_1} \cdots dx^{\mu_p}. \qquad (5.8)$$

The operator

$$d = dx^\mu \frac{\partial}{\partial x^\mu} \qquad (5.9)$$

maps p-forms into $p + 1$-forms and satisfies $d^2 = 0$. This defines what mathematicians
call a cohomology problem. Namely, one wants to characterize
all solutions of $d\omega = 0$, modulo trivial solutions of the type $\omega = d\phi$ (such
trivial solutions are called exact forms) where $\phi$ is a well defined $p - 1$-form.
This is a generalization of finding things with zero curl which cannot be 
written as gradients. A well known physics example is a constant magnetic 
field on the surface of a sphere or a torus. The set of closed but not exact 
p-forms is called the cohomology at dimension p. 

The importance of p-forms stems from the fact that their integrals over 
p-dimensional submanifolds are completely defined by the differential topol- 
ogy of the manifold. No metrical concepts are needed to define these 
integrals. 




T. Banks: M-Theory and Cosmology 






Another important concept is that of a nontrivial p-cycle on a manifold. 
Basically this is a p-dimensional submanifold which cannot be shrunk to a 
point because of the topology of the manifold. The simplest examples are 
the a and b 1-cycles on the torus of Figure 1. Actually, it is an equivalence 
class of submanifolds because any curve which circles around the a cycle 
and then does any kind of topologically trivial thing on the rest of the torus 
is equivalent to the a cycle. De Rham’s theorem tells us that there is a one
to one correspondence between p-cycles and p-forms, as we have mentioned 
above. 

After that brief reminder, we can turn to the question of what the co- 
homology of K3 manifolds is. Since it is a topological question we can 
answer it by examining an example. Every 4-manifold has cohomology at
dimension 0 (the constant function) and dimension 4 (the volume form).
The simplest K3 manifold is the “physicist's K3”, the
singular orbifold $T^4/Z_2$. This is defined by taking a rectilinear torus with
axes $2\pi R_i$ and identifying points related by $x^i \to \pm x^i + 2\pi n_i R_i$. This has
16 fixed points in the fundamental domain: $x^i = R_i(1 \pm 1)\pi/2$. The space
is flat except at the fixed points but has curvature singularities there. It
can be verified that the holonomies around the fixed points are in a single
$SU(2)$ subgroup of $O(4)$ so the space is a K3.

It is easy to see that the nontrivial one cycles on the torus all become 
trivial on the orbifold (the corresponding one forms are odd under the orbifold
transformation and are projected out). The torus has six obvious
2-cycles, which are the six different $T^2$s in the $T^4$. In addition, when one
studies this singular manifold as a limit of smooth K3’s by the methods of 
algebraic geometry (realizing the manifold as the solution set of polynomial 
equations) one finds that each of the fixed points is actually a two sphere of 
zero area. Thus there are twenty two non-trivial two cycles on a K3 mani- 
fold. By the de Rham theorem, there are twenty two linearly independent 
elements of the cohomology at dimension two of K3. 
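
The counting in the last two paragraphs is simple enough to enumerate directly. The following sketch, with hypothetical function names, lists the fixed points of the reflection on the four torus (in units of $\pi R_i$) and keeps only those constant forms on the torus that are even under the reflection, adding one two sphere per fixed point.

```python
from itertools import product
from math import comb

def fixed_points():
    """Fixed points of x -> -x on the four torus, in units of pi * R_i.

    Each coordinate is either 0 or pi * R_i, encoded here as 0 or 1.
    """
    return list(product((0, 1), repeat=4))

def orbifold_betti_numbers():
    """Number of independent p-cycles, p = 0..4, after the orbifold projection."""
    betti = []
    for p in range(5):
        # A constant p-form dx^{i_1} ... dx^{i_p} picks up (-1)^p under
        # x -> -x, so only even p survive the projection.
        betti.append(comb(4, p) if p % 2 == 0 else 0)
    # Each fixed point contributes one two sphere of zero area.
    betti[2] += len(fixed_points())
    return betti

if __name__ == "__main__":
    print(len(fixed_points()))
    print(orbifold_betti_numbers())
```

The result reproduces the numbers quoted in the text: sixteen fixed points, no one cycles or three cycles, and $6 + 16 = 22$ two cycles.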

One can introduce a bilinear form on two forms in a four manifold. The
product of two two forms is a four form, which can be integrated over the
manifold. Define:

$$I_{ij} = \int \omega_i \wedge \omega_j. \qquad (5.10)$$

Remember that $\int *\omega \wedge \omega$ is the usual Euclidean Maxwell action for a two
form field strength and is thus positive definite. The form $I$ is thus negative
on antiselfdual tensors and positive on self dual ones. We have already
established that there are three independent antiselfdual covariantly constant
(and therefore closed but not exact) two forms. It can be shown
that the rest of the cohomology consists of self dual two forms, so that
$I$ has signature (19,3). A basis can be chosen in which it has the form
$I = \sigma_1 \oplus \sigma_1 \oplus \sigma_1 \oplus E_8 \oplus E_8$, where $\sigma_1$ is the familiar Pauli matrix and $E_8$
is the Cartan matrix of the Lie Group $E_8$ (the matrix of scalar products of
simple roots).
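
The signature quoted above can be checked from the block form of the matrix. This is a sketch under assumptions: the $E_8$ Cartan matrix is built from its Dynkin diagram with one particular (arbitrary) labelling of the nodes, the sign conventions are those of the text, and the function names are illustrative. Exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def e8_cartan():
    """Cartan matrix of E8: a chain of seven nodes plus one node attached to the third."""
    edges = [(i, i + 1) for i in range(6)] + [(2, 7)]
    m = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for i, j in edges:
        m[i][j] = m[j][i] = -1
    return m

def pivots(matrix):
    """Pivots of Gaussian elimination without row exchanges.

    For a symmetric matrix with no zero pivot, the signs of the pivots
    give the signature and their product gives the determinant.
    """
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    out = []
    for k in range(n):
        if m[k][k] == 0:
            raise ZeroDivisionError("zero pivot; elimination without pivoting fails")
        out.append(m[k][k])
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n):
                m[i][j] -= factor * m[k][j]
    return out

def signature_and_det():
    """Signature (n_plus, n_minus) and determinant of three sigma_1 blocks plus two E8 blocks."""
    n_plus = n_minus = 0
    det = Fraction(1)
    # Each block [[0, 1], [1, 0]] has eigenvalues +1 and -1 and determinant -1.
    for _ in range(3):
        n_plus += 1
        n_minus += 1
        det *= -1
    for _ in range(2):
        p = pivots(e8_cartan())
        n_plus += sum(1 for x in p if x > 0)
        n_minus += sum(1 for x in p if x < 0)
        for x in p:
            det *= x
    return (n_plus, n_minus), det

if __name__ == "__main__":
    print(signature_and_det())
```

Each $E_8$ block is positive definite with unit determinant, so the full 22 dimensional form has nineteen positive and three negative directions and determinant of unit magnitude.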

This is very suggestive. The heterotic string compactified on a three
torus has nineteen left moving and three right moving currents (the sixteen
$E_8 \times E_8$ gauge currents and linear combinations of the momentum and
winding number currents on the torus). Indeed, Narain [16] introduced the 
same scalar product, where left movers have opposite signature to right 
movers, in his study of heterotic compactifications on tori. At this point, 
readers who are not familiar with the heterotic string will undoubtedly 
benefit from... 



5.2 A heterotic interlude 

The bosonic string “lives” in 26 bosonic dimensions, while the superstring 
lives in 10. This discrepancy has two sources, both of which have to do 
with the difference between the world sheet gauge groups of the two theo- 
ries. The bosonic string has only worldsheet diffeomorphism invariance and 
the 26 is required to cancel the anomaly in this symmetry against a corresponding
anomaly coming from Faddeev-Popov ghosts. Type II superstrings
have worldsheet supergravity^®. On the one hand, this requires the embed- 
ding coordinates to have superpartners which also contribute to the 
anomaly. On the other hand, since the world sheet gauge group is larger, 
there are more ghosts. The net result of these two effects is to reduce to 10 
the maximal number of Minkowski dimensions in which the Type II strings 
can propagate. Smaller numbers of dimensions can be achieved by compactification.

In two dimensions, the smallest SUSY algebra is called (1,0) and has 
a single right moving spinor supercharge. There is a corresponding chiral 
worldsheet supergravity. Type II strings have the vector like completion of 
this, (1, 1) SUSY, which consists of one left moving and one right moving 
supercharge. The heterotic string is defined as a perturbative string the- 
ory with only (1,0) worldsheet SUSY. Its maximal number of Minkowski 
dimensions is 10. 

In ten Minkowski dimensional target space, the world sheet field theory 
of any string theory is a collection of free massless fields each of which can be 
separated into its left and right moving components. The ten dimensional 
heterotic string has 10 right moving $X^\mu$s and their superpartners, and 26 left
moving $X^\mu$s. In order to eliminate an extra continuum from the 16 extra
bosonic dimensions, we can compactify them on a torus. This simply means 
that we eliminate all states which are not periodic functions in these 16 
coordinates. 



^®Not to be confused with spacetime supergravity, which is another beast entirely.

The restriction to toroidal compactification in fact follows from a deeper
principle. The construction outlined so far was a consistent gauge fixed
quantum theory with infinitesimal (1,0) superdiffeomorphism invariance in
two dimensions. We have seen above that perturbative string theory requires
us to evaluate the world sheet path integral on Riemann surfaces of
arbitrary genus. For genus one and higher, there are disconnected pieces of
the diffeomorphism group and we must require invariance under those as
well. This is called the constraint of modular invariance. It turns out that
this restriction is satisfied iff we choose the sixteen dimensional torus to be
the Cartan torus of one of the groups $E_8 \times E_8$ or $SO(32)$. The operators
$(\partial_\tau - \partial_\sigma) X^I$, with $I = 1 \ldots 16$, are then the current algebra for the $U(1)^{16}$
Cartan subgroup. The currents corresponding to raising and lowering operators
of the group have the form of exponentials $e^{i v_i \cdot X}$, where $v_i$ are the
roots of the algebra.

If we further compactify the heterotic string on a d-torus, we get d pairs
of $U(1)$ currents (which, for generic radii of the tori, are not completed to a
nonabelian group) from $(\partial_\tau \pm \partial_\sigma) X^i$. Half of these are purely left moving
and the other half purely right moving. The vectorlike combinations are
simply the Kaluza Klein symmetries expected when compactifying GR on
a torus. The axial combinations couple to the winding number of strings
around the d-torus. Perturbative string theory always has a two form gauge
field B, which couples to the string world sheet as $\int_{\rm worldsheet} B$. When
integrated around the d 1-cycles of the torus, it gives rise to d one forms
which couple to string winding number. In addition to this gain in the
rank of the symmetry group, a generic toroidal compactification will lose
the nonabelian parts of the group. This is because we can have Wilson
lines around the cycles of the torus. Thus, at a generic point on the moduli
space of toroidal compactifications of the heterotic string, the gauge group
is $U(1)^{16+d} \times U(1)^d$, where we have separated the contributions coming from
left and right moving currents.

Thus, one way of viewing toroidally compactified heterotic string theory
is to say that it consists of the modes of $16 + d$ left moving and d right moving
scalar fields, where the zero modes of these fields live on independent tori
(there are also fermionic partners and fields representing the noncompact
dimensions, but we do not need to discuss them here). A given compactification
can then be specified by talking about the allowed values of the
dimensionless momenta around the torus, a discrete set of numbers $(l_L, l_R)$.
One must insist that the vertex operators with any allowed momenta are
all relatively local on the world sheet in order that the expressions for
tree level string amplitudes make sense^®. Furthermore one must impose a
condition called modular invariance to guarantee that one loop amplitudes
make sense. These conditions turn out to be equivalent [16] to the restriction

$$(l_L)^2 - (l_R)^2 \in 2Z \qquad (5.11)$$

combined with the requirement that the lattice of all possible momenta be
self dual^®. Such lattices turn out to be unique up to an $O(16 + d, d)$ rotation.
It can be shown that the parameters of these rotations are equivalent to
choices of background Wilson lines, constant antisymmetric tensor fields
on the d-torus and the choice of the flat metric on the torus.

$\tau$ and $\sigma$ are worldsheet coordinates and the $X^I$ satisfy $(\partial_\tau + \partial_\sigma) X^I = 0$.

^®Left moving or right moving fields are not local operators. The vertex operators are

Thus, heterotic string theory compactifled on a d-torus with generic 
Wilson lines, has a natural 0(16 + d,d,Z) invariant scalar product de- 
fined on the space of worldsheet currents. The fact that the same sort of 
scalar product arises as the intersection matrix of cohomology classes of K3 
manifolds is the first hint that the two systems are related. 

5.3 Enhanced gauge symmetries 

One of the reasons Type II strings did not receive much attention after 
the discovery of the heterotic string was that they did not appear to have 
the capability of producing gauge groups and representations like those of 
the standard model. The same was true of IID SUGRA. However, the 
suggestive connection with heterotic strings leads one to suspect that a 
mechanism for producing nonabelian gauge symmetries has been overlooked. 
The theory of singularities of K3 manifolds was worked out by Kodaira and 
others in the 1950’s. It turns out that one can characterize singular K3 
manifolds in terms of topologically nontrivial cycles which shrink to zero 
size. The singularity is determined by the intersection matrix I restricted 
to the shrinking cycles. It turns out that in almost all cases, the resulting 
matrix was the Cartan matrix of some nonabelian Lie group. In the purely 
mathematical study of four manifolds, there is no way to understand where 
the Lie group is. 

However, viewed from the point of view of M-theory compactiflcation on 
K3, a nonabelian group jumps into view. Indeed, imagine BPS M2 branes 
wrapped around the shrinking cycles of the singularity. These will be mass- 
less particles in the uncompactifled seven dimensional spacetime. Since we 
have 16 SUSYs in the effective seven dimensional theory these must include 
massless vector fields, since the smallest representation of this SUSY alge- 
bra is the vector multiplet. Furthermore, even away from the singularity. 



exponentials of these fields and are generally not local either. But certain discrete subsets 
of these vertex operators are relatively local. 

^^The dual of a lattice with a scalar product defined on it is the set of all vectors which 
have integer scalar product with the vectors of the original lattice. 
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we have 22 17(1) vector multiplets. Indeed one can write three form poten- 
tials in IID SUGRA of the form A^j^vxdx^dx''dx^ = o(j(A)dA^Wi, where 
oji are the 22 independent harmonic two forms on K3, and X are the seven 
noncompact coordinates. The are gauge potentials in a product of [/(I) 
algebras which will be the Cartan subalgebra of the nonabelian group that 
appears at the singularity. Since membranes are charged under the three 
form, we see, using the de Rham connection between forms and cycles, that 
the new massless vector bosons are charged under the Cartan subalgebra, 
i.e. we have a nonabelian gauge theory. 
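
The charge assignment in this last step can be made a little more explicit. The following is only a sketch under stated assumptions: the symbols $T_{M2}$ (membrane tension), $\Sigma_j$ (a basis of shrinking two-cycles) and $q_{ij}$ are labels introduced here for illustration, the three form is expanded in harmonic two forms as described above, and all numerical normalizations and sign conventions are ignored.

```latex
% A membrane wrapped on the two-cycle \Sigma_j, moving along a worldline
% \mathbb{R} in the seven noncompact dimensions, couples to the three form A_3.
\begin{align}
  T_{M2}\int_{\mathbb{R}\times\Sigma_j} A_3
    &= T_{M2}\Big(\int_{\Sigma_j}\omega_i\Big)\int_{\mathbb{R}} a^i_\mu\,dX^\mu
     \;\equiv\; q_{ij}\int_{\mathbb{R}} a^i_\mu\,dX^\mu , \\
  q_{ij} &\propto \int_{\Sigma_j}\omega_i , \qquad
  m_j \sim T_{M2}\,\mathrm{Area}(\Sigma_j) \;\longrightarrow\; 0
  \quad\text{as the cycle shrinks.}
\end{align}
```

If the $\omega_i$ are chosen Poincaré dual to the cycles $\Sigma_i$, then $q_{ij}$ is the intersection matrix restricted to the shrinking cycles, which the text identifies with a Cartan matrix (up to an overall sign convention). The wrapped membranes then carry the U(1) charges of roots, as the extra vector bosons of a nonabelian group should.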

One further point of general interest. As is obvious from the $Z_2$ orbifold
example, the singularities that give rise to nonabelian gauge groups live
on manifolds of finite codimension (or branes) in the compact space. If the
volume of the compact space is large, this will lead to a large ratio between
the gauge and gravitational couplings in the noncompact effective field theory.
We will discuss the phenomenological implications of this observation
in the context of the Hořava-Witten scenario below.

The emergence of nonabelian gauge theory from singularities is one of 
the most beautiful results of the M-theory revolution. It combines Wilson’s 
observation that singularities in the free energy functional at second order 
phase transitions could be correlated with the appearance of massless states, 
with the mysterious occurrence of Lie groups in singularity theory, uniting 
physics and mathematics in a most satisfying fashion. One can go much fur- 
ther along these lines. When studying singularities of Calabi-Yau manifolds 
of dimension three or four one encounters cases which cannot be explained 
in terms of gauge theory, but which do have an explanation in terms of 
nontrivial fixed points of the renormalization group. The interplay between 
SUSY, singularity theory, and the theory of the renormalization group in 
these examples, is a stunning illustration of the power of M-theory [17]. 

So far, we have seen how enhanced gauge symmetries arise from
M-theory on K3 but have not yet delivered on our promise to make a connection
with the heterotic string. We have seen in toroidal examples that
the key to string duality is the existence of light BPS states when cycles of
a manifold shrink to zero. The limit of M-theory on K3 which gives rise to
weakly coupled heterotic string theory (on a torus, $T^3$) is the limit where
the K3 volume shrinks to zero in Planck units. The M5 brane wrapped
around the K3 gives rise to a low tension string in this limit [14]. Recall
that the world volume of the fivebrane carries an antisymmetric tensor
gauge field with self dual 3 form field strength, $H = *H$, which satisfies
$dH = 0$. For configurations of the five brane wrapped on K3 one can study
configurations of H of the form $H = j_i\,\omega_i$, where $j_i$ is a world volume one
form which depends only on the two coordinates of the world volume which
are not wrapped on K3, and $\omega_i$ is one of the 22 harmonic forms on K3. In
order to satisfy $H = *H$, $j_i$ must satisfy $j_i = c_i * j_i$, where $\omega_i = c_i * \omega_i$
(recall that 19 of the forms on K3 have $c_i = 1$, while for the other three
it is negative). $dH = 0$ implies $dj_i = 0$. In more familiar notation, the
string formed from the M5 brane wrapped on K3 will have 19 left moving
and 3 right moving conserved currents. This is precisely the
bosonic field content of the (bosonic form of) the heterotic string on a three
torus. The evident SUSY of the wrapped brane configuration guarantees
the existence of the appropriate world sheet fermions.

The heterotic string was discovered as a solution to the consistency con- 
ditions of perturbative string theory. Though it was obviously the pertur- 
bative string most closely connected to real physics, no one ever claimed 
that it was beautiful. The derivation of its properties from the interplay 
between the K3 manifold and the M5 brane of 11D SUGRA can make such
an aesthetic claim. It is another triumph of string duality. 

This construction automatically gives rise to the heterotic string compactified
on a three torus. Note that again the geometry of K3 disappears
from the ken of low energy observers and is replaced by a space of a different
dimension and topology. Following Aspinwall [15] we can try to recover
the ten dimensional heterotic string from the K3 picture. The mathematics
is somewhat complex but in the end one recovers the picture of Hořava
and Witten [43] (if one is careful to keep the full $E_8 \times E_8$ gauge symmetry
manifest at all times). That is, one finds 11D SUGRA compactified on an
interval, with $E_8$ gauge groups living on each of two 10 dimensional boundaries^20.
The heterotic string coupling is related to the size of the interval,
L, by $g_s = (L/L_p)^{3/2}$.

The Hořava-Witten description of the strongly coupled heterotic string
in ten dimensions was originally motivated by considerations of anomaly 
cancellation and matching onto various weakly coupled string limits. It is 
somewhat more satisfying to realize it as a singular limit of compactification 
of M-theory on a K3 manifold. 

Witten [44] has pointed out that the strong coupling limit of this picture
can resolve one of the phenomenological problems of weakly coupled
heterotic string theory. Among the few firm predictions of heterotic perturbation
theory is the equality between the gauge coupling unification scale
M and the four dimensional Planck scale $m_p$. In reality these differ by
a factor of 100. Careful consideration of threshold corrections brings this
discrepancy to a factor of 20, but one may still find it disturbing. Witten
points out that in the picture of 11D SUGRA on an interval it is easy to
remove this discrepancy. Indeed, if we compactify the system to four dimensions
on a Calabi-Yau 3-fold of volume $V_6$^21, then the four dimensional
gauge couplings are given approximately by $1/g^2 \sim (V_6/L_p^6)$, while the four
dimensional Planck mass is given by $m_p^2 \sim (L V_6/L_p^9)$. Tuning L and the
volume to the experimental numbers (and taking into account various numerical
factors) gives a linear size for the 3-fold of order $2L_p$ and $L \sim 70L_p$.
The unification scale M is of order $L_p^{-1}$.

^20 As one takes the limit corresponding to infinite three torus, one is forced to K3
manifolds with two $E_8$ singularities.

^21 Actually, due to details which we will not enter into, the Calabi-Yau volume varies
along the interval. The parameter $V_6$ is its value at the end of the interval where the
standard model gauge couplings live.

I want to emphasize three features of this proposal. First, the four dimensional
Planck mass is not a fundamental scale of the theory. Rather, it
is the unification scale, identified with the eleven dimensional Planck scale,
which plays this role. Secondly, since we have seen that gauge groups generically
arise on branes in M-theory, Witten’s proposal may be only one out of
many possibilities for resolving the discrepancy between the unification and
Planck scales. A possible advantage of a more flexible scenario might be the
elimination of all large dimensionless numbers from fundamental physics, in
particular the factor of 70 in Witten’s scenario. If the codimension of the
space on which the standard model lives is large, then the factor of order
100 which is attributed to the volume of this space in Planck units might
just be $2^6$. Finally, let us note that in this brane scenario, the bulk physics
enjoys a larger degree of SUSY (twice as much) than the branes. This will
be useful in our discussion of inflationary cosmology below, and may also
help to solve the SUSY flavor problem [45,46].

5.4 Conclusions

In this brief summary of M-theory and its duality symmetries, we have seen
that classical geometry can undergo monumental contortions while the theory
itself remains smooth. When there are enough noncompact dimensions
and enough supersymmetry, there are exact moduli spaces of degenerate
vacua which interpolate between regions which have very different classical
geometric interpretations by passing through regions where no geometrical
interpretation is possible (for the compact part of the space). The most
striking example is perhaps the K3 compactification, where the 80 geometrical
parameters describing K3 manifolds are interpreted in an appropriate
region of the parameter space as the geometry, and background gauge and
antisymmetric tensor fields, of a three torus with heterotic strings living
on it. The clear moral of the story is that “geometry is in the eye of the
(low energy) beholder”, and must actually be a low energy approximation
to some other concept, which we do not as yet understand.

Equally important for our further discussion is that the modular parameters
interpolate smoothly between different geometrical regions and
exist even in regions which cannot be described by geometrical concepts.
In different regimes of moduli space, the moduli can be viewed as zero
modes of different low energy fields living on different background geometries.
But, although their interpretation can change, the moduli remain
intact, and (with enough SUSY), their low energy dynamics is completely
determined. In subsequent sections we argue that they are the appropriate
variables for discussing the evolution of the universe.

6 Quantum cosmology 

6.1 Semiclassical cosmology and quantum gravity 

In today’s lecture we will leave behind our brief survey of M-theory and 
duality and proceed to cosmological questions. We will begin by discussing 
some “fundamental” issues in quantum cosmology and proceed to a some- 
what more practical application of M-theory to inflationary models. None 
of this work will lead to the kind of detailed model building and compar- 
ison with observation that is the bread and butter of most astroparticle 
physics. In my opinion, the current theoretical understanding of M-theory 
does not warrant the construction of such detailed models. Detailed in- 
flationary model building requires, among other things, knowledge of the 
inflaton potential. In an M-theory context this means that we have to have
control over SUSY breaking terms in the low energy effective action. Even 
the advances of the last few years have not helped us to make significant 
progress in understanding SUSY breaking. 

My aim in these lectures will be to address general questions like what 
the inflaton is likely to be, the relation between the energy scales of inflation
and SUSY breaking, the connection between various scales and pure num- 
bers encountered in cosmology with the fundamental parameters, and so 
on. We will see that a rather amusing picture can be built up on this basis, 
which is quite different from most conventional cosmological models. I will 
concentrate here primarily on my own work (and that of my collaborators) 
rather than trying to give a survey of all possible approaches to cosmology 
within M-theory. Professor Veneziano will be giving a detailed exposition 
of one of the other major approaches, so between the two of us you will get 
some idea of what is possible. 

The discussion will be divided into two parts, one more “fundamental” 
and the other more “practical”. The aim of the first part will be to pose
the problem of how the conventional equations of cosmology may eventu- 
ally be derived from a fully quantum mechanical system. We will also begin 
to address the question of why M-theory does not choose one of its highly 
supersymmetric vacua for the description of the world around us. We end 
this exposition by introducing a heterodox antiinflationary cosmology. The 
“practical” section will concentrate on issues related to moduli and SUSY 
breaking. We will see that cosmological considerations suggest a vacuum
structure similar to that proposed by Hořava and Witten, and put further
constraints on the form of SUSY breaking. One also obtains an explana- 
tion of the size of the fluctuations in the microwave background in terms of 
the fundamental ratio between the unification and Planck scales. We will 
conclude with an inflationary cosmology very different from most of those 
in the literature. Among its virtues is the possibility of supporting a QCD 
axion with decay constant as large as the fundamental scale. Indeed, the 
assumption that such an axion exists gives an explanation of the temper- 
ature of matter radiation equality in terms of the fundamental parameters 
of the theory. 

We will begin our discussion of “fundamental” cosmology by recalling 
the treatment of quantum cosmology in GR. One of the more bizarre con- 
sequences of an attempt to marry GR to QM is the infamous Problem of 
Time. A generally covariant theory is constructed for the precise purpose of 
not having a distinguished global notion of time. In classical mechanics this 
is very nice, but quantum mechanically it turns the conventional Hamilto- 
nian framework on its head. The problem can be seen in simple systems 
with time reparametrization invariance, that is, actions of the form 

$$ \int dt\; e\,L(q, \dot q/e) \tag{6.1} $$

where q represents a collection of variables which transform as scalars under
time reparametrization, and e is an einbein (i.e. $e\,dt$ is time reparametrization
invariant). We can use the symmetry to set e equal to a constant
(gauge fixing), but the e equation of motion then says that the canonical
Hamiltonian of the q vanishes:

$$ H = \dot q\,\frac{\partial L}{\partial \dot q} - L = 0. \tag{6.2} $$

In simple covariant systems like Chern-Simons gauge theory, one can solve
this Hamiltonian constraint and quantize the system in the sense that the 
classical observables are realized as operators in a Hilbert space. However, 
the notion of time evolution is still somewhat elusive. More generally, in 
realistic systems where the constraints are not explicitly soluble, one recov- 
ers time evolution by finding classical variables. For example, if spacetime 
has a boundary, with asymptotically flat or asymptotically Anti deSitter 
boundary conditions, then one can use one of the symmetry generators of 
the classical geometry at infinity as a time evolution operator. 
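
The vanishing of the Hamiltonian stated above can be checked in one step. The short derivation below is a sketch for a single variable q; the shorthand $v \equiv \dot q/e$ and the example potential $W(q)$ are introduced here for illustration and do not appear in the text.

```latex
\begin{align}
  S &= \int dt\; e\,L(q, v), \qquad v \equiv \dot q/e , \\
  0 = \frac{\delta S}{\delta e}
    &= L + e\,\frac{\partial L}{\partial v}\,\frac{\partial v}{\partial e}
     = L - v\,\frac{\partial L}{\partial v} , \\
  \text{gauge } e = 1:\qquad
  H &= \dot q\,\frac{\partial L}{\partial \dot q} - L = 0 .
\end{align}
% Example: L = v^2/2 - W(q) gives the constraint \dot q^2/2 + W(q) = 0,
% so only zero "energy" trajectories are allowed.
```

The einbein acts as a Lagrange multiplier: it has no dynamics of its own, and its equation of motion is the constraint that removes any preferred time variable from the quantum theory.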

In cosmology one generally does not have the luxury of a set of variables 
whose quantum fluctuations are frozen by the boundary conditions. The no- 
tion of time evolution is tied to a semiclassical approximation for a particular 
set of variables. Different cosmological evolutions may not be described by 
the same semiclassical variables. One of the challenges of this framework
is to find a generic justification for the semiclassical approximation. To see 
how the idea works, one “quantizes” the $g_{00}$ Einstein equation by writing it
in Hamiltonian form and naively turning the canonical momenta into dif- 
ferential operators (at the level of sophistication of this analysis, it does not 
make sense to worry about ordering ambiguities). This gives the Wheeler- 
DeWitt equation, a second order PDE which is supposed to pick out the 
physical states of the system inside a space of functionals of the fields on 
a fixed time slice. The challenge is to put a positive metric Hilbert inner 
product on the space of physical states and identify a one parameter group 
of unitary operators that can be called time evolution. 

It is well known that, viewed as a conventional field theory, the confor- 
mal factor of the gravitational field has negative kinetic energy. In quantiza- 
tion of GR in perturbation theory around any classical solution of the field 
equations, the negative modes are seen to be gauge artifacts and a positive 
definite Hamiltonian is found for gravitons. 

In general closed cosmologies, the analogous statement is the following: 
the Wheeler DeWitt constraint completely eliminates all negative modes 
from the physical Hilbert space. It is convenient to think of GR in synchronous
gauge, where the $g_{0i}$ components of the metric vanish and $g_{00} = 1$.
Such a gauge is built by choosing a spacelike hypersurface and following 
timelike geodesics orthogonal to this hypersurface to define the evolution 
into the future. It can then be shown that all of the negative modes repre- 
sent the freedom to change the choice of the initial hypersurface (the many 
fingered time of GR). The Wheeler-DeWitt equation is then the constraint 
which says that physics must be independent of this choice. It is often 
convenient to solve the constraint in stages. Namely, among all spacelike
surfaces in a given spacetime geometry, there are one parameter families 
related to each other by propagation along orthogonal timelike geodesics. 
The choice of such a family eliminates all but one of the negative modes, 
the last one being related to the choice of which surface in the family is 
called the initial surface. That is, it is related to the time as measured by 
observers following the timelike geodesics which define the family^22. It can
be chosen to be any monotonic function along these trajectories, and it is 
often convenient to choose the volume of the spatial metric. 

^22 This discussion is purely classical, but mirrors the less intuitive mathematical
operations which one carries out in semiclassical quantization of the WD equation.

The upshot of all this is that once a family of hypersurfaces is chosen,
one still has a single component of the Wheeler-DeWitt constraint which has
not yet been imposed. Classically this is the familiar Friedmann equation
relating the expansion rate to the energy density. A naive quantization
of this equation gives a hyperbolic PDE on a space with signature (1, n).
A form of this equation sufficiently general for our purposes is

$$ \left[\hbar\,G^{ab}(m)\,\partial_a\partial_b + g^{AB}(q,m)\,\partial_A\partial_B + \frac{1}{\hbar}V(m) + U(q,m)\right]\Psi = 0. \tag{6.3} $$

We have separated the variables into classical ($m^a$) and quantum ($q^A$) and
introduced a formal parameter $\hbar$ to organize the WKB like approximation
for the classical variables. The metric G is hyperbolic with one negative
direction, while the metric g has Euclidean signature. The analysis we are
presenting goes back to [37]. Up to terms of order $\hbar$, it is easy to see that
the solution of this equation has the form

$$ \Psi(m,q) = A(m)\,e^{iS(m)/\hbar}\,\psi(m,q) \tag{6.4} $$

where

$$ -G^{ab}\nabla_a S\,\nabla_b S + V = 0 \tag{6.5} $$

$$ G^{ab}\left(\nabla_a S\,\nabla_b A + A\,\nabla_a\nabla_b S\right) = 0 \tag{6.6} $$

$$ i\,G^{ab}\nabla_a S\,\nabla_b\psi + H\psi = 0 \tag{6.7} $$

$$ H = g^{AB}\nabla_A\nabla_B + U. \tag{6.8} $$



The first of these equations has a Hamilton-Jacobi form. It can be solved
by finding classical motions $\dot m^a(t) = G^{ab}\nabla_b S(m)$. S is then the action of
the classical solution, and (6.5) is satisfied if the solution has zero “energy”.
The existence of real zero energy solutions (and thus real S) depends on the
fact that $G^{ab}$ has nonpositive signature.

Using the classical solution we recognize that (6.7) can be written
as a conventional Schrödinger equation:

$$ i\,\partial_t\psi = H\psi. \tag{6.9} $$

Positivity of the Hamiltonian (6.8) requires that $g_{AB}$ have Euclidean signature.
Note that since H depends on the m’s, the Hamiltonian will in
general be time dependent. Furthermore, the quantum fluctuations of the
classical variables m will have a sensible Hamiltonian only if the metric $G_{ab}$
has only a single negative eigenvalue. Thus, we see that within the naive
approach to quantization of Einstein’s equations, the existence of a Hilbert 
space interpretation of the physical states, with a positive definite scalar 
product and a unitary time evolution with a sensible Hamiltonian operator, 
is closely tied to the fact that Einstein’s equations coupled to matter with 
positive kinetic energy have a hyperbolic metric with exactly one negative 
eigenvalue (after gauge fixing). 
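
To see where the Hamilton-Jacobi equation (6.5) comes from, one can substitute the WKB form of the wave function into the constraint and collect the most singular power of $\hbar$. The sketch below assumes that the constraint has the schematic form $[\hbar\,G^{ab}\partial_a\partial_b + g^{AB}\partial_A\partial_B + V/\hbar + U]\Psi = 0$ and that $\Psi = A\,e^{iS/\hbar}\,\psi$. It treats $G^{ab}$ as constant, so that covariant derivatives reduce to partial derivatives, and it does not track the convention-dependent signs and numerical factors that appear at the next order.

```latex
\begin{align}
  \hbar\,G^{ab}\partial_a\partial_b\!\left(A\,e^{iS/\hbar}\,\psi\right)
    &= e^{iS/\hbar}\left[-\frac{1}{\hbar}\,G^{ab}\,\partial_a S\,\partial_b S\;A\,\psi
       + O(\hbar^{0})\right], \\
  \frac{1}{\hbar}\,V\,\Psi
    &= e^{iS/\hbar}\,\frac{1}{\hbar}\,V\,A\,\psi , \\
  \text{order } \hbar^{-1}:\qquad
    & -G^{ab}\,\partial_a S\,\partial_b S + V = 0 .
\end{align}
```

The terms of order $\hbar^0$ each contain one derivative of S together with one derivative of A or of $\psi$, plus the q-dependent operator acting on $\psi$. They are split into a transport equation for A(m) and an equation for $\psi$ which, evaluated along the classical trajectory, becomes the Schrödinger equation of the text.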

One may wonder whether these observations will survive in a more realistic
theory of quantum gravity. We know that Einstein’s action is only a
low energy effective description of M-theory. Even those heretics who refuse
to admit that M-theory is the unique sensible theory of quantum gravity^23
are unlikely to insist that quantization of this famously nonrenormalizable 
field theory by the crude procedure described above is the final word on the 
subject of quantum gravity. I would like to present some evidence that in 
M-theory the (1, n) signature of the metric on the space of classical variables 
is indeed guaranteed by rather robust properties of the theory. 

Before doing so I want to point out how M-theory addresses the question
of the existence of semiclassical variables $m^a$. There are actually two
desiderata for the choice of such variables. First, we want the semiclassical
approximation for these variables to be valid during most of cosmic history^24.
Secondly, given the notion of energy implied by the classical solution for
the $m^a$, one often wants to be able to make a Born-Oppenheimer approximation
in which the $m^a$ are slow variables or collective coordinates. Note
that the classical nature of the $m^a$ is crucial, while the Born-Oppenheimer
approximation is not. Without classical variables we would have no notion
of time evolution. The Born-Oppenheimer approximation allows us to dis- 
cuss the evolution of the classical variables in terms of an effective action in 
which other degrees of freedom are ignored. This is particularly useful be- 
low the Planck energy, since we have no idea how to describe the full set of 
degrees of freedom of the theory in the Planck regime, but are comfortable 
with a quantum field theory description below that. Nonetheless, we will 
argue below that the classical variables might still provide a useful notion 
of time evolution during the Planck era, as long as the variable which we 
identify as the spatial volume of classical geometry below the Planck scale, 
is large. It is important for any such pre-Planckian endeavor that M-theory 
gives an unambiguous meaning to (at least highly supersymmetric) moduli 
spaces even in regimes not describable by low energy Einstein equations. 
In a regime of super-Planckian energy and large volume, one would have 
to know something about the dynamics and the state of all the degrees of 
freedom in the system to understand how they affect the evolution of the
classical variables. 

^23 One hopes that the world has not come to a state in which one has to emphasize
that a sentence like this is a joke, but let me record that fact in this footnote just to be
on the safe side.

^24 As Borges pointed out long ago [7] it is almost impossible to avoid self referential
paradoxes when trying to conceptualize a system in which the notion of time is an illusion
or an approximation. According to the paragraphs above, cosmic history and its
implied notion of time only exist because of the classical nature of the $m^a$. Rather than
attempting the impossible task of being logically and linguistically precise, I will make
the common assumption that “any sensible physicist who has followed my discussion
understands exactly what I mean by these imprecise phrases”.

As suggested in the last paragraph, both classicality and slow evolution
can be understood in M-theory if we identify the $m^a$ as moduli, though
with a slightly different definition of that word than the usual “parameters
describing continuous families of supersymmetric vacua with $d \ge 4$ asymptotically
flat dimensions”. In a theory of quantum gravity, SUSY can only
be defined nonperturbatively if we insist on studying states with certain a
priori boundary conditions. The SUSY charges, just like the Hamiltonian,
are defined as generators of certain asymptotic symmetries of the whole
class of metrics satisfying the boundary conditions. However, if we restrict
attention to the classical SUGRA equations, then we can define what
we mean by solutions which preserve a certain amount of supersymmetry.
Since the Hamiltonian appears in the SUSY algebra, they will all be static
solutions. To find them, we simply require that certain SUSY variations
of all the fields vanish at the solution. Typically, we find a moduli space
of continuously connected solutions preserving a particular SUSY subalgebra.
The parameters $m^a$ are coordinates on this space. In particular, for
11D SUGRA, each solution will be a static, compact ten geometry, and the
volume of the compact space, V, will be one of the moduli.

Now consider classical motions in which the $m^a$ become functions of
time. The effective action derived by plugging such time dependent moduli
into the SUGRA action has the form

$$ S = \int dt\;\frac{1}{e}\,G_{ab}(m)\,\dot m^a(t)\,\dot m^b(t) \tag{6.10} $$

where e is an einbein which imposes time reparametrization invariance. G
is a hyperbolic metric with signature (1, n). In fact, it is easy to see that the
only modulus with negative kinetic energy is the volume V. This is because
our choice of parametrization of the spacetime metric has implicitly chosen
a family of spacelike hypersurfaces in these spacetimes (those of constant
t). The constraint equation coming from varying e can be written

$$ \frac{\dot V^2}{V^2} = \hat G_{ab}\,\dot{\hat m}^a\,\dot{\hat m}^b \tag{6.11} $$

and is just the Friedmann equation for a Robertson-Walker cosmology. The
hatted quantities stand for the moduli space of SUSY solutions with
volume 1.

It is easy to prove (and well known to those who have studied cosmologies
with minimally coupled massless scalar fields) that the field equations of this
system give, for the volume variable, exactly the equations of a Robertson-Walker
universe with equation of state $p = \rho$. The “energy density” $\rho$
then scales like $1/V^2$. The $\hat m$ variables satisfy the equations of geodesic
motion in the metric $\hat G$, under the influence of cosmological friction. This is
equivalent to free geodesic motion in the reparametrized time s defined by
$ds/dt = V^{-1}$. The volume is always monotonically decreasing or increasing
in these solutions. The derivation of these facts is an enjoyable exercise in 
classical mechanics which I urge the students to do.

Finally I want to note that under the transformation $V \to cV$, the
action scales as $S \to cS$. Thus, Planck’s constant $\hbar$ can be absorbed in
V, and the system is classical at large V. I want to emphasize that the
actual spacetime geometries described by these evolution equations can be
quite complex. That is, the $\hat m^a$ might parametrize a set of Calabi-Yau
manifolds. However the simple properties of the evolution on this moduli
space described above are unaffected by the complexity of the underlying
manifolds.
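
The scalings quoted in the last two paragraphs follow quickly once a definite normalization is chosen. The sketch below assumes, for illustration only, that the moduli space metric splits as $G_{ab}\,dm^a dm^b = V\,(-dV^2/V^2 + \hat G_{ab}\,d\hat m^a d\hat m^b)$ with $\hat G$ independent of V. This split, its normalization, and the constant E are assumptions chosen to reproduce (6.11) and the $V \to cV$ scaling; they are not formulas taken from the text.

```latex
\begin{align}
  S &= \int dt\,\frac{V}{e}\left[-\frac{\dot V^2}{V^2}
        + \hat G_{ab}\,\dot{\hat m}^a\dot{\hat m}^b\right]
     && \text{(assumed form; } S \to cS \text{ under } V \to cV\text{)} \\
  \frac{\delta S}{\delta e} = 0,\; e = 1: \quad
    & \frac{\dot V^2}{V^2} = \hat G_{ab}\,\dot{\hat m}^a\dot{\hat m}^b \equiv \rho \\
  \frac{\delta S}{\delta \hat m^a} = 0: \quad
    & \frac{d}{dt}\!\left(V\,\hat G_{ab}\,\dot{\hat m}^b\right)
      = \tfrac{1}{2}\,V\,\partial_a\hat G_{bc}\,\dot{\hat m}^b\dot{\hat m}^c \\
  \frac{ds}{dt} = \frac{1}{V}: \quad
    & \frac{d}{ds}\!\left(\hat G_{ab}\,\frac{d\hat m^b}{ds}\right)
      = \tfrac{1}{2}\,\partial_a\hat G_{bc}\,\frac{d\hat m^b}{ds}\frac{d\hat m^c}{ds},
    \qquad
    \hat G_{ab}\,\frac{d\hat m^a}{ds}\frac{d\hat m^b}{ds} = E = \text{const} \\
  \Rightarrow \quad
    & \rho = \frac{E}{V^2}, \qquad \dot V = \pm\sqrt{E}, \qquad
      p = -\frac{d(\rho V)}{dV} = \frac{E}{V^2} = \rho .
\end{align}
```

In the time s the friction term disappears and the motion is an affinely parametrized geodesic of $\hat G$, so the conserved quantity E fixes $\rho \propto 1/V^2$. The volume then changes at a constant rate, which is why it is monotonic, and the first law $d(\rho V) = -p\,dV$ gives the stiff equation of state $p = \rho$.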

The point of all of these classical SUGRA manipulations is that, given 
enough SUSY, there are nonrenormalization theorems which protect this 
structure in regimes where the classical SUGRA approximation is invalid. 
For example, if there are 16 or more SUSYs preserved, then one can prove 
that there is no renormalization of the terms with $\le 2$ derivatives in the
effective Lagrangian for the moduli, to all orders in the expansion around 
classical SUGRA. Furthermore, these are the cases where SUGRA is dual to 
Type II (32 SUSYs) or Heterotic (16 SUSYs) string theories, compactified 
on tori. The weak coupling string expansions are in some sense expanding 
around the opposite limit from the SUGRA expansion (extremely small 
volumes, in Lp units, of compact submanifolds rather than extremely large 
ones). To all orders in the weak coupling string expansions one can establish 
that the moduli space exists (i.e. that no potential term is generated in the 
effective Lagrangian for the moduli) and that its topological and metrical 
structure is the same as that given by 11D SUGRA^25.

There is thus ample evidence that there is some exact sense in which 
the configuration space of M-theory contains regions which map precisely 
on to the classical moduli spaces of SUGRA solutions preserving at least 
16 SUSYs. For 8 SUSYs, the situation is a bit more complicated. The well 
understood regions of moduli space here correspond to 11D SUGRA (or
Type II strings) compactified on a manifold which is the product of a Calabi-Yau
3-fold and a torus, or heterotic strings compactified on K3 manifolds
times a torus. Calabi-Yau 3-folds come in different topological classes, but
there is a conjecture that all of these regions are on one continuously con- 
nected moduli space once quantum mechanics is taken into account. This 
statement depends on the fact that the strong nonrenormalization theorems 
described above are not valid. The metric on moduli space is modified by 
higher order corrections. However, one can still prove a nonrenormalization 
theorem for the potential on moduli space (namely that it is identically 
zero) so that the moduli space still exists as an exact concept. 



^25 If one is willing to decompactify three of the toroidal directions and view the
remaining moduli as zero modes of fields in 3 + 1 dimensional Minkowski space, then
one can prove these statements from SUSY without recourse to any expansion. It is
likely that these proofs can be adapted to the completely compactified situation, but this
has not yet been done.

This is all that is needed to establish that the moduli are good candidates
to be the semiclassical, Born-Oppenheimer variables that are necessary
for the derivation of a cosmology from a generally covariant quantum
system. Indeed, the absence of a potential on moduli space means that 
the classical moduli can execute arbitrarily slow motions with arbitrarily 
low energy. Thus, in regimes where the classical motion has energy den- 
sity small compared to the fundamental scale of M-theory, they are good 
Born-Oppenheimer variables. Furthermore, the V rescaling symmetry of 
the action shows that whenever V is large the moduli will behave classi- 
cally. Indeed this will even be true in regions where the Born-Oppenheimer 
approximation breaks down, that is regions where the energy density is of 
the order of or larger than the Planck scale, but the volume is large. In 
such a regime, a description of the evolution in terms of classical moduli 
coupled to a stochastic bath of high energy degrees of freedom might be 
appropriate. The mystery will be to understand the equation of state of the 
stochastic bath. 

The necessity of coupling the moduli to another, stochastic set of degrees 
of freedom appears also very late in the history of the universe. The modular 
energy density scales to zero much faster than either matter or radiation. 
Thus if there is any mechanism which generates matter or radiation, they 
will quickly dominate the energy density of the universe. In [25] it was 
shown that when the moduli can be treated as the homogeneous modes 
of quantum fields, there is an efficient mechanism for converting modular 
energy into radiation. Thus, at late times, one must study the motion of 
the moduli coupled to a stochastic bath of radiation and/or matter. 

To summarize, the existence of a set of approximately classical, low 
energy collective coordinates which take values in a space of hyperbolic 
signature $(1,n)$ seems to be a very robust property of M-theory. These
would seem to be just what we need for a derivation of cosmology from
the theory. From the point of view of someone who is deeply attached to
“the real world”, the only problem with this analysis is that the universes it
describes become highly supersymmetric in the large volume limit. We will 
defer the discussion of moduli in the context of broken SUSY to Section 7. 

6.2 Extreme moduli 

In this subsection I will present some results about the beginning and end 
of cosmic evolution in the highly supersymmetric situations I have just de- 
scribed. One motivation for this is to provide a controlled model for more 
realistic cosmologies. Another is to try to address the question with which 
we began these lectures, of why the universe as described by M-theory does 
not end up in a highly supersymmetric state. Finally, we will discover some 
very interesting results about duality and singularities which are closely
related to the hyperbolic structure of moduli space and the question of the
arrow of time. 

We will discuss only the case of maximally SUSY moduli spaces, which
are obtained by compactifying M-theory on a ten torus. The parameters
are a flat metric on the torus, and the expectation value of the three form
potential on three cycles of the torus. Most of these are compact angle
variables. Among the metric variables, only the radii $R_i$ of a rectilinear
torus are noncompact, while the three form expectation values are all angle
variables because of the Dirac-Nepomechie-Teitelboim quantization condition
[30] (their conjugate momenta are quantized). Thus, intuitively, we can
restrict our discussion of the possible extreme regions of moduli space to the
radii of a rectilinear torus. This argument can be made mathematically
precise using the description of the moduli space as a homogeneous space. We
will call the restricted rectilinear moduli space the Kasner moduli space.

The metrics which describe motion on the Kasner moduli space have the
form

$$ds^2 = -dt^2 + R_i^2(t)\,(dx^i)^2 \tag{6.12}$$

where the $x^i$ have period $2\pi$. Inserting this ansatz into the action, we find
that the solutions of the equations of 11D SUGRA for the individual radii are

$$R_i(t) = L_p\,(t/t_0)^{p_i} \tag{6.13}$$

where

$$\sum_i p_i^2 = \sum_i p_i = 1. \tag{6.14}$$

Note that equation (6.14) implies that at least one of the $p_i$ is negative.
We have restricted attention to the case where the volume expands as time 
goes to infinity. We will see below that, although the equations are time 
reversal invariant, all of these solutions visit two very different regions of 
moduli space at the two endpoints of their evolution. One of the regions has 
a simple semiclassical description, while the other does not. This introduces 
a natural arrow of time into the system - the future is identified as the regime 
where the semiclassical approximation becomes better and better. 
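
As a quick numerical illustration of the Kasner conditions $\sum_i p_i = \sum_i p_i^2 = 1$, the sketch below builds one set of ten exponents and checks that the smallest one is negative. The construction (moving away from the isotropic point along a random direction orthogonal to $(1,\ldots,1)$) and the name `kasner_exponents` are illustrative assumptions, not something taken from the text. The only solution without a negative exponent is the degenerate one with a single $p_i = 1$ and all others zero.

```python
import math
import random

def kasner_exponents(n=10, seed=0):
    """Return n exponents p_i with sum(p) = 1 and sum(p^2) = 1.

    Illustrative construction: p = u + r * v, where u = (1/n, ..., 1/n),
    v is a unit vector with sum(v) = 0, and r^2 = 1 - 1/n.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(v) / n
    v = [x - mean for x in v]                     # now sum(v) = 0
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]                     # now sum(v^2) = 1
    r = math.sqrt(1.0 - 1.0 / n)
    return [1.0 / n + r * x for x in v]

p = kasner_exponents()
print("sum p   =", round(sum(p), 12))
print("sum p^2 =", round(sum(x * x for x in p), 12))
print("min p   =", min(p))                        # negative for a generic choice
```

Changing `seed` samples other points of the same constraint surface; every generic sample has both positive and negative exponents, so some radii grow while others shrink.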

It is well known that all of these solutions are singular at both infinite and 
zero time. Some of the radii shrink to zero at both ends of the evolution. 
Note that if we add a matter or radiation energy density to the system 
then it dominates the system in the infinite volume limit and changes the 
solutions for the geometry there. However, near the singularity at vanishing 
volume both matter and radiation become negligible (despite the fact that 
their densities are becoming infinite) and the solutions retain their Kasner 
form. 

All of this is true in 11D SUGRA. In M-theory we know that many
regions of moduli space which are apparently singular in 11D SUGRA
can be reinterpreted as living in large spaces described by weakly coupled
Type II string theory or a dual version of 11D SUGRA. The vacuum Einstein
equations are of course invariant under these U-duality transformations. So
one is led to believe that many apparent singularities of the Kasner
universes are perfectly innocuous.

Note however that phenomenological matter and radiation densities 
which one might add to the equations are not invariant under duality. The 
energy density truly becomes singular as the volume goes to zero. How then 
are we to understand the meaning of the duality symmetry? The resolution 
is as follows. We know that when radii go to zero, the effective field theory
description of the universe in 11D SUGRA becomes singular due to the
appearance of new low frequency states. We also know that the singularity 
in the energy densities of matter and radiation implies that scattering cross 
sections are becoming large. Thus, it seems inevitable that phase space 
considerations will favor the rapid annihilation of the existing energy den- 
sities into the new light degrees of freedom. This would be enhanced for 
Kaluza-Klein like modes, whose individual energies are becoming large near 
the singularity. 

Thus, near a singularity with a dual interpretation, the contents of the 
universe will be rapidly converted into new light modes, which have a com- 
pletely different view of what the geometry of space is. The most effective 
description of the new situation is in terms of the transformed moduli and 
the new light degrees of freedom. The latter can be described in terms of 
fields in the reinterpreted geometry. We want to emphasize strongly the 
fact that the moduli do not change in this transformation, but are merely 
reinterpreted. This squares with our notion that they are exact concepts in 
M-theory. By contrast, the fields whose zero modes they appear to be in a 
particular semiclassical regime, do not always make sense. The momentum 
modes of one interpretation are brane winding modes in another and there 
is no approximate way in which we can consider both sets of local fields at 
the same time. Fortunately, there is also no regime in which both kinds of 
modes are at low energy simultaneously, so in every regime where the time 
dependence is slow enough to make a low energy approximation, we can use 
local field theory. 

This mechanism for resolving cosmological singularities leads naturally 
to the question of precisely which noncompact regions of moduli space can 
be mapped into what we will call the safe domain in which the theory can 
be interpreted as either 11D SUGRA or Type II string theory with radii
large in the appropriate units. 

6.3 The moduli space of M-Theory on rectangular tori 

In this section, we will study the structure of the moduli space of M-theory
compactified on various tori $T^k$ with $k \le 10$. We are especially interested in
noncompact regions of this space which might represent either singularities
or large universes. As explained above, the three-form potential $A_{mnp}$ will
be set to zero and the circumferences of the cycles of the torus will be
expressed as the exponentials

$$\frac{L_i}{L_p} = s^{p_i}, \qquad i = 1, 2, \ldots, k. \tag{6.15}$$

The remaining coordinates $x^0$ (time) and $x^{k+1}, \ldots, x^{10}$ are considered to be
infinite and we never dualize them. It is important to distinguish the
variable $s$ here from the time in the Kasner solution. Here we are just
parametrizing possible asymptotic domains in the moduli space, whereas
the Kasner solution is to be used as a metric valid for all values of the
parameter $t$. We will see that it interpolates between two very different
asymptotic domains.

The radii are encoded in the logarithms $p_i$. We will study limits of the
moduli space in various directions which correspond to keeping $p_i$ fixed and
sending $s \to \infty$ (the change to $s \to 0$ is equivalent to $p_i \to -p_i$, so we do
not need to study it separately). In terms of this parametrization of the
extreme regions of moduli space, we can see that a Kasner solution with
parameters $p_i$ will visit the regime of moduli space characterized by $p_i$ as
$t \to \infty$ and the regime $-p_i$ as $t \to 0$.



6.4 The 2/5 transformation 

M-theory has dualities which allow us to identify the vacua with different
$p_i$'s. A subgroup of this duality group is the $S_k$ which permutes the $p_i$'s.
Without loss of generality, we can assume that $p_1 \le p_2 \le \cdots \le p_{10}$. We will
assume this in most of the text. The full group that leaves invariant rectilinear
tori with vanishing three form is the Weyl group of the noncompact
$E_k$ group of SUGRA. We will denote it by $G_k$. We will give an elementary
derivation of the properties of this group for the convenience of the reader.
$G_k$ is generated by the permutations of the cycles on the torus, and one
other transformation which acts as follows:

$$(p_1, p_2, \ldots, p_k) \mapsto \left(p_1 - \tfrac{2s}{3},\; p_2 - \tfrac{2s}{3},\; p_3 - \tfrac{2s}{3},\; p_4 + \tfrac{s}{3},\; \ldots,\; p_k + \tfrac{s}{3}\right) \tag{6.16}$$

where $s = (p_1 + p_2 + p_3)$ (not to be confused with the asymptotic parameter
$s$ of the previous section). Before explaining why this transformation is a
symmetry of M-theory, let us point out several properties of (6.16).

• The total sum $S = \sum_{i=1}^{k} p_i$ changes to $S \mapsto S + (k - 9)s/3$. So for
$k < 9$, the sum increases if $s < 0$, for $k = 9$ the total sum is an
invariant and for $k > 9$ the sum decreases for $s < 0$;

• If we consider all $p_i$'s to be integers which are equal modulo 3, this
property will hold also after the 2/5 transformation. The reason is
that, due to the assumptions, $s$ is a multiple of three and the coefficients
$-2/3$ and $+1/3$ differ by an integer;

• As a result, from any initial integer $p_i$'s we get $p_i$'s which are multiples
of 1/3, which means that all the matrix elements of the 2/5 transformation
matrices are integer multiples of 1/3;

• The order of $p_1, p_2, p_3$ is not changed (the difference $p_1 - p_2$ remains
constant, for instance). Similarly, the order of $p_4, p_5, \ldots, p_k$ is unchanged.
However the ordering between $p_{1 \ldots 3}$ and $p_{4 \ldots k}$ can change in
general. By convention, we will follow each 2/5 transformation by a
permutation which places the $p_i$'s in ascending order;

• The bilinear quantity $I = (9 - k)\sum_i p_i^2 + \left(\sum_i p_i\right)^2 = (10 - k)\sum_i p_i^2 + 2\sum_{i<j} p_i p_j$
is invariant under $G_k$.
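
The properties listed above are easy to check by direct computation. The following is a minimal sketch, using exact rational arithmetic, of the transformation (6.16) followed by the sorting convention. The function names `two_five` and `bilinear` and the sample vector are hypothetical choices made for illustration.

```python
from fractions import Fraction

def two_five(p):
    """Apply (6.16) to the three smallest entries of p, then sort ascending."""
    p = sorted(Fraction(x) for x in p)
    s = p[0] + p[1] + p[2]
    return sorted([x - 2 * s / 3 for x in p[:3]] + [x + s / 3 for x in p[3:]])

def bilinear(p):
    """I = (9 - k) * sum(p_i^2) + (sum p_i)^2 for a vector of length k."""
    k = len(p)
    return (9 - k) * sum(x * x for x in p) + sum(p) ** 2

# Sample vector with k = 8 whose entries are integers equal modulo 3.
p = [Fraction(x) for x in [-7, -4, -1, 2, 5, 8, 11, 14]]
s = p[0] + p[1] + p[2]
q = two_five(p)

print([int(x) for x in q])
print(sum(q) - sum(p) == (len(p) - 9) * s / 3)   # the sum shifts by (k - 9) s / 3
print(bilinear(q) == bilinear(p))                # I is invariant
print(len({x % 3 for x in q}) == 1)              # entries remain equal modulo 3
```

Here $s = -12 < 0$ and $k = 8 < 9$, so the total sum increases, in line with the first property.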

The fact that the 2/5 transformation is a symmetry of M-theory can be proved
as follows. Let us interpret $L_1$ as the M-theoretical circle of a Type IIA
string theory. Then the simplest duality which gives us a theory of the
same kind (IIA) is the double T-duality. Let us perform it on the circles
$L_2$ and $L_3$. The claim is that if we combine this double T-duality with a
permutation of $L_2$ and $L_3$ and interpret the new $L_1$ as the M-theoretical
circle again, we get precisely (6.16).

Another illuminating way to view the 2/5 transformation
is to compactify M-theory on a three torus. The original M2-brane and the
M5-brane wrapped on the three torus are both BPS membranes in eight
dimensions. The tension of the original M2-brane is of order $L_p^{-3}$, while
that of the membrane which comes from the wrapped M5 is $V L_p^{-6}$, where
$V$ is the volume of the three torus. When the three torus is large and
the 11D SUGRA approximation is valid, the wrapped M5-brane is much
heavier than the M2-brane, while in the small volume limit, the opposite
is true. We have seen previously that in limits where classical geometrical
descriptions are breaking down, one can find a new classical description by
following those BPS states which become lightest in the limit. This suggests
that we try to define $l_p^{-3} = V L_p^{-6}$ and $\tilde{V} = L_p^{-3}\, l_p^{6}$ and try to imagine
a duality transformation in M-theory which takes a compactification on a
small three torus to a compactification on a large one, with corresponding
redefinition of the Planck scale. Aharony [38] has given arguments that such
a duality transformation exists, and it can be demonstrated rigorously in
Matrix Theory. In the limit in which one of the cycles of the $T^3$ is small, so
that a type II string description becomes appropriate, it is just the double
T-duality of the previous paragraph. The fact that this transformation plus
permutations generates $G_k$ was proven by the authors of [39] for $k < 9$.
I leave it to the reader to verify that the effect of this transformation on the
variables $p_i$ is precisely that described above.
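
One way to carry out this check is sketched below. It assumes that the duality acts on the individual circles of the three torus as $L_i \to L_p^3 L_i / V$, which is the standard form of this duality but is not spelled out in the text. It also writes the asymptotic parametrization with a large parameter $\lambda$, a symbol introduced here only to avoid a clash with $s = p_1 + p_2 + p_3$.

```latex
\begin{align*}
\frac{L_i}{L_p} &= \lambda^{p_i}, \qquad
\frac{V}{L_p^{3}} = \frac{L_1 L_2 L_3}{L_p^{3}} = \lambda^{s}, \qquad s = p_1 + p_2 + p_3, \\
l_p^{3} &= \frac{L_p^{6}}{V}
  \;\Longrightarrow\; \frac{l_p}{L_p} = \lambda^{-s/3}, \\
\tilde L_i &= \frac{L_p^{3} L_i}{V} \quad (i = 1, 2, 3)
  \;\Longrightarrow\; \frac{\tilde L_i}{l_p} = \lambda^{p_i - s}\,\lambda^{s/3} = \lambda^{p_i - 2s/3}, \\
\tilde L_j &= L_j \quad (j \ge 4)
  \;\Longrightarrow\; \frac{\tilde L_j}{l_p} = \lambda^{p_j}\,\lambda^{s/3} = \lambda^{p_j + s/3}.
\end{align*}
```

Measured in units of the new Planck length $l_p$, the exponents are therefore shifted by $-2s/3$ on the three torus and by $+s/3$ elsewhere, as in (6.16). Under the same assumption the new volume is $\tilde L_1 \tilde L_2 \tilde L_3 = L_p^{9}/V^{2} = L_p^{-3}\, l_p^{6}$.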
In the following subsection we will use this group of duality transformations
to prove that extreme regions of the moduli space fall into a number
of distinct categories. One is such that some kind of semiclassical description
of the physics is valid, and breaks up into regions that are described
by 11D SUGRA or weakly coupled Type IIA or IIB string theory. The
other is completely mysterious and has no known semiclassical description.
Each Kasner solution visits both of these regions at the extreme ends of
its trajectory. It is thus reasonable to identify the past with the unknown
region and the future with the semiclassical regime.

The derivations below are based primarily on elementary algebra and 
the definition of the duality transformations given above. However, many 
cosmologists may want to skip the technical details. 

6.5 The boundaries of moduli space 

There are three types of boundaries of the toroidal moduli space which
are amenable to detailed analysis. The first is the limit in which
eleven-dimensional supergravity becomes valid. We will denote this limit as 11D.
The other two limits are weakly coupled Type IIA and IIB theories in 10
dimensions. We will call the domain of asymptotic moduli space which can
be mapped into one of these limits the safe domain.

• For the limit 11D, all the radii must be greater than $L_p$. Note that
for $t \to \infty$ it means that all the radii are much greater than $L_p$. In
terms of the $p_i$'s, this is the inequality $p_i > 0$;

• For Type IIA, the dimensionless coupling constant $g_s^{IIA}$ must be
smaller than 1 (much smaller for $t \to \infty$) and all the remaining radii
must be greater than $L_s$ (much greater for $t \to \infty$);

• For Type IIB, the dimensionless coupling constant $g_s^{IIB}$ must be
smaller than 1 (much smaller for $t \to \infty$) and all the remaining radii
must be greater than $L_s$ (much greater for $t \to \infty$), including the extra
radius whose momentum arises as the number of wrapped M2-branes
on the small $T^2$ in the dual 11D SUGRA picture.

If we assume the canonical ordering of the radii, i.e. $p_1 \le p_2 \le p_3 \le \cdots \le p_k$,
we can simplify these requirements as follows:

• 11D: $0 < p_1$;

• IIA: $p_1 < 0 < p_1 + 2p_2$;

• IIB: $p_1 + 2p_2 < 0 < p_1 + 2p_3$.
To derive this, we have used the familiar relations:

$$\frac{L_1}{L_p} = \left(g_s^{IIA}\right)^{2/3} = \left(\frac{L_p}{L_s}\right)^{2} = \left(\frac{L_1}{L_s}\right)^{2/3} \tag{6.17}$$

for the 11D/IIA duality ($L_1$ is the M-theoretical circle) and similar relations
for the 11D/IIB case ($L_1 < L_2$ are the parameters of the $T^2$ and $L_{IIB}$ is the
circumference of the extra circle):

$$\frac{L_1}{L_2} = g_s^{IIB}, \qquad 1 = \frac{L_1 L_s^{2}}{L_p^{3}}, \tag{6.18}$$

$$\frac{1}{L_{IIB}} = \frac{L_1 L_2}{L_p^{3}}. \tag{6.19}$$

Note that the regions defined by the inequalities above cannot overlap, since
the regions are defined by $M$, $M^c \cap A$, $A^c \cap B$, where ${}^c$ means the complement
of a set. Furthermore, assuming $p_i \le p_{i+1}$ it is easy to show that $p_1 + 2p_3 < 0$
implies $p_1 + 2p_2 < 0$, and $p_1 + 2p_2 < 0$ implies $3p_1 < 0$ or $p_1 < 0$.

This means that (neglecting the boundaries where the inequalities are
saturated) the region outside 11D $\cup$ IIA $\cup$ IIB is defined simply by
$p_1 + 2p_3 < 0$. The latter characterization of the safe domain of moduli space
will simplify our discussion considerably.
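
The three inequalities, together with the complementary condition $p_1 + 2p_3 < 0$, amount to a simple decision procedure. A minimal sketch follows; the function name `classify` and the returned labels are hypothetical, and points where one of the inequalities is saturated are reported as lying on a boundary.

```python
def classify(p):
    """Classify an asymptotic direction p = (p_1, ..., p_k), k >= 3."""
    p1, p2, p3 = sorted(p)[:3]          # canonical ordering p_1 <= p_2 <= p_3
    if p1 > 0:
        return "11D"
    if p1 < 0 < p1 + 2 * p2:
        return "IIA"
    if p1 + 2 * p2 < 0 < p1 + 2 * p3:
        return "IIB"
    if p1 + 2 * p3 < 0:
        return "outside"                # not in the safe domain
    return "boundary"                   # some inequality is saturated

for p in [(1, 2, 3), (-1, 2, 3), (-4, 1, 5), (-3, 1, 1), (0, 1, 2)]:
    print(p, classify(p))
```

Only the three smallest exponents enter, which is why the single test $p_1 + 2p_3 < 0$ is enough to detect a point outside the safe domain.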

The invariance of the bilinear form defined above gives an important
constraint on the action of $G_k$ on the moduli space. For $k = 10$ it is easy to
see that, considering the $p_i$ to be the coordinates of a ten vector, it defines
a Lorentzian metric on this ten dimensional space. Thus the group $G_{10}$ is
a discrete subgroup of $O(1,9)$. The direction in this space corresponding
to the sum of the $p_i$ is timelike, while the hyperplane on which this sum
vanishes is spacelike. We can obtain the group $G_9$ from the group $G_{10}$
by taking $p_{10}$ to infinity and considering only transformations which leave
it invariant. Obviously then, $G_9$ is a discrete subgroup of the transverse
Galilean group of the infinite momentum frame. For $k \le 8$ on the other
hand, the bilinear form is positive definite and $G_k$ is contained in $O(k)$.
Since the latter group is compact, and there is a basis in which the $G_k$
matrices are all integers divided by 3, we conclude that in these cases $G_k$
is a finite group. In a moment we will show that $G_9$ and a fortiori $G_{10}$
are infinite. Finally we note that the 2/5 transformation is a spatial reflection in
$O(1,9)$. Indeed it squares to 1 so its determinant is $\pm 1$. On the other hand,
if we take all but three coordinates very large, then the 2/5 transformation
of those coordinates is very close to the spatial reflection through the plane
$p_1 + p_2 + p_3 = 0$, so it is a reflection of a single spatial coordinate.
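
The reflection property can also be verified exactly for $k = 10$. The sketch below assumes the bilinear form $\eta(p,q) = (\sum_i p_i)(\sum_i q_i) - \sum_i p_i q_i$, whose diagonal is $I$, and checks that the unsorted 2/5 transformation coincides with the reflection along the vector $w = (-\tfrac23, -\tfrac23, -\tfrac23, \tfrac13, \ldots, \tfrac13)$. The names `eta`, `w`, `two_five_unsorted` and `reflect` are hypothetical.

```python
from fractions import Fraction

K = 10

def eta(p, q):
    """Bilinear form for k = 10; eta(p, p) = (sum p_i)^2 - sum p_i^2 = I."""
    return sum(p) * sum(q) - sum(a * b for a, b in zip(p, q))

# Direction of the shift in (6.16): p -> p + s * w with s = p_1 + p_2 + p_3.
w = [Fraction(-2, 3)] * 3 + [Fraction(1, 3)] * (K - 3)

def two_five_unsorted(p):
    """The transformation (6.16) on the first three entries, without re-sorting."""
    s = p[0] + p[1] + p[2]
    return [a + s * b for a, b in zip(p, w)]

def reflect(p):
    """Reflection of p through the hyperplane eta-orthogonal to w."""
    c = 2 * eta(w, p) / eta(w, w)
    return [a - c * b for a, b in zip(p, w)]

p = [Fraction(x) for x in [-7, -1, 2, 2, 2, 2, 2, 2, 5, 5]]
print(eta(w, w))                                        # negative: w is spacelike
print(two_five_unsorted(p) == reflect(p))               # same map
print(two_five_unsorted(two_five_unsorted(p)) == p)     # it squares to the identity
print(eta(reflect(p), reflect(p)) == eta(p, p))         # I is preserved
```

With the sign convention in which the direction $(1,\ldots,1)$ has $\eta > 0$ (timelike), the negative value of $\eta(w,w)$ identifies $w$ as a spacelike direction.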

Fig. 3. The structure of the moduli space for $T^2$.

We now prove that $G_9$ is infinite. Start with the first vector of $p_i$'s given
below and iterate (6.16) on the three smallest radii (a strategy which we
will use all the time) and sort the $p_i$'s after each step, so that their index reflects
their order on the real line. We get







$$\begin{array}{c}
(-1, -1, -1,\; -1, -1, -1,\; -1, -1, -1) \\
(-2, -2, -2,\; -2, -2, -2,\; +1, +1, +1) \\
(-4, -4, -4,\; -1, -1, -1,\; +2, +2, +2) \\
(-5, -5, -5,\; -2, -2, -2,\; +4, +4, +4) \\
\cdots \\
(3 \times (2 - 3n),\; 3 \times (-1),\; 3 \times (3n - 4)) \\
(3 \times (1 - 3n),\; 3 \times (-2),\; 3 \times (3n - 2))
\end{array} \tag{6.20}$$

(here $3 \times (x)$ denotes three consecutive entries equal to $x$),
so the entries grow (linearly) to infinity.
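
The linear growth can be reproduced by iterating the transformation directly. This is a minimal sketch, assuming the starting vector with all nine entries equal to $-1$; the helper `two_five` is a hypothetical name for (6.16) followed by sorting.

```python
from fractions import Fraction

def two_five(p):
    """Apply (6.16) to the three smallest entries of p, then sort ascending."""
    p = sorted(Fraction(x) for x in p)
    s = p[0] + p[1] + p[2]
    return sorted([x - 2 * s / 3 for x in p[:3]] + [x + s / 3 for x in p[3:]])

p = [Fraction(-1)] * 9
orbit = [p]
for _ in range(5):
    p = two_five(p)
    orbit.append(p)

for row in orbit:
    # The sum stays fixed for k = 9 while the largest entry keeps growing.
    print([int(x) for x in row], "sum =", int(sum(row)))
```

Since no vector ever repeats, the orbit of a single point under $G_9$ is already infinite.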



6.6 Covering the moduli space

We will show that there is a useful strategy which can be used to transform
any point $\{p_i\}$ into the safe domain in the case of $T^k$, $k < 9$. The strategy
is to perform iteratively 2/5 transformations on the three smallest radii.

Assuming that $\{p_i\}$ is outside the safe domain, i.e. $p_1 + 2p_3 < 0$ (the $p_i$'s
are sorted so that $p_i \le p_{i+1}$), it is easy to see that $p_1 + p_2 + p_3 < 0$ (because
$p_2 \le p_3$). As we said below equation (6.16), the 2/5 transformation on
$p_1, p_2, p_3$ always increases the total sum $\sum p_i$ for $p_1 + p_2 + p_3 < 0$. But this
sum cannot increase indefinitely because the group $G_k$ is finite for $k < 9$.
Therefore the iteration process must terminate at some point. The only
way this can happen is that the assumption $p_1 + 2p_3 < 0$ no longer holds,
which means that we are in the safe domain. This completes the proof
for $k < 9$.
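
The argument can be followed on a concrete example. The sketch below iterates the 2/5 transformation on the three smallest entries of a sample vector with $k = 8$ until $p_1 + 2p_3 \ge 0$, printing the total sum at each step. The names `two_five` and `to_safe_domain`, the step limit, and the starting vector are assumptions made for illustration.

```python
from fractions import Fraction

def two_five(p):
    """Apply (6.16) to the three smallest entries of p, then sort ascending."""
    p = sorted(Fraction(x) for x in p)
    s = p[0] + p[1] + p[2]
    return sorted([x - 2 * s / 3 for x in p[:3]] + [x + s / 3 for x in p[3:]])

def to_safe_domain(p, max_steps=1000):
    """Iterate until p_1 + 2 p_3 >= 0 and return every vector visited."""
    p = sorted(Fraction(x) for x in p)
    visited = [p]
    while p[0] + 2 * p[2] < 0:
        if len(visited) > max_steps:
            raise RuntimeError("no safe point reached within max_steps")
        p = two_five(p)
        visited.append(p)
    return visited

for q in to_safe_domain([-10, -7, -1, 2, 2, 5, 8, 11]):   # k = 8
    print([int(x) for x in q], "sum =", int(sum(q)))
```

The sum increases at every step, and the last vector satisfies $p_1 + 2p_2 < 0 < p_1 + 2p_3$, the condition stated earlier for the IIB region.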

For $k = 9$ the proof is more difficult. The group $G_9$ is infinite and
furthermore, the sum of all $p_i$'s does not change. In fact the conservation
of $\sum p_i$ is the reason that only points with $\sum p_i > 0$ can be dualized to the
safe domain. The reason is that if $p_1 + 2p_3 \ge 0$, also $3p_1 + 6p_3 \ge 0$ and
consequently

$$\begin{aligned}
& p_1 + p_2 + p_3 + p_4 + p_5 + p_6 + p_7 + p_8 + p_9 \\
&\quad \ge p_1 + p_1 + p_1 + p_3 + p_3 + p_3 + p_3 + p_3 + p_3 \ge 0.
\end{aligned} \tag{6.21}$$

This inequality is saturated only if all $p_i$'s are equal to each other. If their
sum vanishes, each $p_i$ must then vanish. But we cannot obtain a zero vector
from a nonzero vector by 2/5 transformations because they are nonsingular.
If the sum $\sum p_i$ is negative, it is also clear that we cannot reach the safe
domain.

However, if $\sum_{i=1}^{9} p_i > 0$ then we can map the region of moduli space
with $t \to \infty$ to the safe domain. We will prove it for rational $p_i$'s only.
This assumption compensates for the fact that the order of $G_9$ is infinite.
Assuming the $p_i$'s rational is however sufficient because we will see that a finite
product of 2/5 transformations brings us to the safe domain. But a composition
of a finite number of 2/5 transformations is a continuous map from
$\mathbb{R}^9$ to $\mathbb{R}^9$ so there must be at least a “ray” part of a neighborhood which
can be also dualized to the safe domain. Because $\mathbb{Q}^9$ is dense in $\mathbb{R}^9$ our
argument proves the result for general values of $p_i$.

From now on we assume that the $p_i$'s are rational numbers. Everything
is scale invariant so we may multiply them by a common denominator to
make them integers. In fact, we choose them to be integer multiples of three since
in that case we will have integer $p_i$'s even after 2/5 transformations. The
numbers $p_i$ are now integers equal modulo 3 and their sum is positive. We
will define a critical quantity

$$C = \sum_{i<j}^{1 \ldots 9} (p_i - p_j)^2. \tag{6.22}$$

This is a priori an integer greater than or equal to zero which is invariant
under permutations. What happens to $C$ if we make a 2/5 transformation
on the radii $p_1, p_2, p_3$? The differences $p_1 - p_2$, $p_1 - p_3$, $p_2 - p_3$ do not change
and this holds for $p_4 - p_5, \ldots, p_8 - p_9$, too. The only contributions to (6.22)
which are changed are from $3 \cdot 6 = 18$ “mixed” terms like $(p_1 - p_4)^2$. Using
(6.16),

$$(p_1 - p_4) \mapsto \left(p_1 - \tfrac{2s}{3}\right) - \left(p_4 + \tfrac{s}{3}\right) = (p_1 - p_4) - s \tag{6.23}$$

so its square

$$(p_1 - p_4)^2 \mapsto [(p_1 - p_4) - s]^2 = (p_1 - p_4)^2 - 2s(p_1 - p_4) + s^2 \tag{6.24}$$

changes by $-2s(p_1 - p_4) + s^2$. Summing over all 18 terms we get
($s = p_1 + p_2 + p_3$)

$$\begin{aligned}
\Delta C &= -2s\,[6(p_1 + p_2 + p_3) - 3(p_4 + \cdots + p_9)] + 18s^2 \\
&= 6s^2 + 6s\Big(\sum_{i=1}^{9} p_i - s\Big) = 6s \sum_{i=1}^{9} p_i.
\end{aligned} \tag{6.25}$$

But this quantity is strictly negative because $\sum p_i$ is positive and $s < 0$ (we
define the safe domain with boundaries, $p_1 + 2p_3 \ge 0$).

This means that $C$ defined in (6.22) decreases after each 2/5 transformation
on the three smallest radii. Since it is a non-negative integer, it
cannot decrease indefinitely. Thus the assumption $p_1 + 2p_3 < 0$ becomes
invalid after a finite number of steps and we reach the safe domain.
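
The descent of $C$ can be watched on a sample vector with $k = 9$, positive sum, and integer entries equal modulo 3. The sketch below checks the relation $\Delta C = 6s\sum_i p_i$ at every step; the names `two_five`, `spread` and `descent` and the starting vector are hypothetical.

```python
from fractions import Fraction

def two_five(p):
    """Apply (6.16) to the three smallest entries of p, then sort ascending."""
    p = sorted(Fraction(x) for x in p)
    s = p[0] + p[1] + p[2]
    return sorted([x - 2 * s / 3 for x in p[:3]] + [x + s / 3 for x in p[3:]])

def spread(p):
    """C = sum over pairs i < j of (p_i - p_j)^2, as in (6.22)."""
    return sum((a - b) ** 2 for i, a in enumerate(p) for b in p[i + 1:])

def descent(p):
    """Iterate to p_1 + 2 p_3 >= 0, checking (6.25); return final p and all C values."""
    p = sorted(Fraction(x) for x in p)
    total = sum(p)                          # invariant for k = 9
    values = [spread(p)]
    while p[0] + 2 * p[2] < 0:
        s = p[0] + p[1] + p[2]
        q = two_five(p)
        assert spread(q) - spread(p) == 6 * s * total
        p = q
        values.append(spread(p))
    return p, values

final, values = descent([-7, -1, 2, 2, 2, 2, 2, 2, 5])
print([int(v) for v in values])
print([int(x) for x in final])
```

For this input the iteration stops on the boundary $p_1 + 2p_3 = 0$, which belongs to the safe domain as defined with boundaries.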

Now let us turn to the fully compactified case. As we pointed out, the
bilinear form $I = 2\sum_{i<j} p_i p_j$ defines a Lorentzian signature metric on the
vector space whose components are the $p_i$. The 2/5 transformation is a
spatial reflection and therefore the group $G_{10}$ consists of orthochronous
Lorentz transformations. Now consider a vector in the safe domain. We
can write it as

$$(-2,\; -2 + a_1,\; 1 + a_2,\; \ldots,\; 1 + a_9)\,S, \qquad S \in \mathbb{R}^{+} \tag{6.26}$$

where the $a_i$ are positive. It is easy to see that $I$ is positive on this configuration.
This means that only the inside of the light cone can be mapped
into the safe domain. Furthermore, since $\sum p_i$ is positive in the safe domain
and the transformations are orthochronous, only the interior of the future
light cone in moduli space can be mapped into the safe domain.

We would now like to show that the entire interior of the forward light
cone can be so mapped. We use the same strategy of rational coordinates
dense in $\mathbb{R}^{10}$. If we start outside the safe domain, the sum of the first three
$p_i$ is negative. We again pursue the strategy of doing a 2/5 transformation
on the first three coordinates and then reordering and iterating. For the case
of $G_9$ the sum of the coordinates was an invariant, but here it decreases
under the 2/5 transformation of the three smallest coordinates, if their sum
is negative. But $\sum p_i$ is (starting from rational values and rescaling to
get integers congruent modulo three as before) a positive integer and must
remain so after $G_{10}$ operations. Thus, after a finite number of iterations,
the assumption that the sum of the three smallest coordinates is negative
must fail, and we are in the safe domain. In fact, we generically enter the
safe domain before this point. The complement of the safe domain always
has negative sum of the first three coordinates, but there are elements in
the safe domain where this sum is negative.
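
For $k = 10$ the monotone quantity is the sum itself, while $I$ stays fixed. The sketch below tracks both along the iteration for a sample integer vector inside the future light cone ($I > 0$, $\sum_i p_i > 0$). The names `two_five` and `light_cone_form` and the starting vector are hypothetical.

```python
from fractions import Fraction

def two_five(p):
    """Apply (6.16) to the three smallest entries of p, then sort ascending."""
    p = sorted(Fraction(x) for x in p)
    s = p[0] + p[1] + p[2]
    return sorted([x - 2 * s / 3 for x in p[:3]] + [x + s / 3 for x in p[3:]])

def light_cone_form(p):
    """I = (sum p_i)^2 - sum p_i^2, the k = 10 bilinear form."""
    return sum(p) ** 2 - sum(x * x for x in p)

p = sorted(Fraction(x) for x in [-7, -1, 2, 2, 2, 2, 2, 2, 5, 5])
trace = [(int(sum(p)), int(light_cone_form(p)))]
while p[0] + 2 * p[2] < 0:
    p = two_five(p)
    trace.append((int(sum(p)), int(light_cone_form(p))))

print(trace)                      # pairs (sum, I) along the iteration
print([int(x) for x in p])        # final vector, with p_1 + 2 p_3 >= 0
```

The sum decreases by $|s|/3$ at each step but stays positive, so the loop has to stop after finitely many steps.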

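It is quite remarkable that the bilinear form $I$ is proportional to the
Wheeler-DeWitt Hamiltonian for the Kasner solutions:

$$\frac{I}{t^2} = \Big(\sum_i \frac{\dot R_i}{R_i}\Big)^2 - \sum_i \Big(\frac{\dot R_i}{R_i}\Big)^2 = \frac{2}{t^2} \sum_{i<j} p_i p_j. \tag{6.27}$$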




The solutions themselves thus lie precisely on the future light cone in moduli 
space. Each solution has two asymptotic regions ($t \to 0, \infty$ in (6.12)), one
of which is in the past light cone and the other in the future light cone of 
moduli space. The structure of the modular group thus suggests a natural 
arrow of time for cosmological evolution. The future may be defined as the 
direction in which the solution approaches the safe domain of moduli space. 
All of the Kasner solutions then, have a true singularity in their past, which 
cannot be removed by duality transformations. 

Actually, since the Kasner solutions are on the light cone, which is the 
boundary of the safe domain, we must add a small homogeneous energy 
density to the system in order to make this statement correct. The con- 
dition that we can map into the safe domain is then the statement that 
this additional energy density is positive. Note that in the safe domain, 
and if the equation of state of this matter satisfies (but does not saturate) 
the holographic bound of [36], this energy density dominates the late time 
evolution of the universe, while near the singularity, it becomes negligible 
compared to the Kasner degrees of freedom. The assumption of a homo- 
geneous negative energy density is manifestly incompatible with Einstein’s 
equations in a compact flat universe so we see that the spacelike domain of
moduli space corresponds to a physical situation which cannot occur in the 
safe domain. 

The backward lightcone of the asymptotic moduli space is, as we have 
said, visited by all of the classical solutions of the theory. 

To summarize: the U-duality group $G_{10}$ divides the asymptotic domains
of moduli space into three regions, corresponding to the spacelike and future
and past timelike regimes of a Lorentzian manifold. Only the future
lightcone can be understood in terms of weakly coupled SUGRA or string
theory. The group theory provides an exact M-theoretic meaning for the
Wheeler-DeWitt Hamiltonian for moduli. Classical solutions of the low
energy effective equations of motion with positive energy density for matter
distributions lie in the timelike region of moduli space and interpolate
between the past and future light cones. We find it remarkable that the
purely group theoretical considerations of this section seem to capture so
much of the physics of toroidal cosmologies.

6.7 Moduli spaces with less SUSY

We would like to generalize the above considerations to situations which 
preserve less SUSY. This enterprise immediately raises some questions, the 
first of which is what we mean by SUSY. Cosmologies with compact spatial 
sections have no global symmetries in the standard sense since there is no 
asymptotic region in which one can define the generators. We will define 
a cosmology with a certain amount of SUSY by first looking for Euclidean 
ten manifolds and three form field configurations which are solutions of the 
equations of 11D SUGRA and have a certain number of Killing spinors.
The first approximation to cosmology will be to study motion on a moduli 
space of such solutions. The motivation for this is that at least in the semi- 
classical approximation we are guaranteed to find arbitrarily slow motions 
of the moduli. In fact, in many cases, SUSY nonrenormalization theorems 
guarantee that the semiclassical approximation becomes valid for slow mo- 
tions because the low energy effective Lagrangian of the moduli is to a 
large extent determined by SUSY. There are however a number of pitfalls 
inherent in our approach. We know that for some SUSY algebras, the mod- 
uli space of compactifications to four or six dimensions is not a manifold. 
New moduli can appear at singular points in moduli space and a new branch 
of the space, attached to the old one at the singular point, must be added. 
There may be cosmologies which traverse from one branch to the other in 
the course of their evolution. If that occurs, there will be a point at which 
the moduli space approximation breaks down. Furthermore, there are many 
examples of SUSY vacua of M-theory which have not yet been continuously 
connected on to the 11D limit, even through a series of “conifold” transitions
such as those described above [41]. In particular, it has been suggested
that there might be a completely isolated vacuum state of M-theory [42].
Thus it might not be possible to imagine that all cosmological solutions 
which preserve a given amount of SUSY are continuously connected to the 
11D SUGRA regime.

Despite these potential problems, we think it is worthwhile to begin a 
study of compact, SUSY preserving, ten manifolds. Here we will only study 
examples where the three form field vanishes. The well known local condition
for a Killing spinor, $D_\mu \epsilon = 0$, has as a condition for local integrability
the vanishing curvature condition

$$R_{\mu\nu}^{ab}\,\gamma_{ab}\,\epsilon = 0. \tag{6.28}$$

Thus, locally the curvature must lie in a subalgebra of the Lie algebra
of $Spin(10)$ which annihilates a spinor. The global condition is that the
holonomy around any closed path must lie in a subgroup which preserves
a spinor. Since we are dealing with 11D SUGRA, we always have both the
$16$ and $\overline{16}$ representations of $Spin(10)$ so SUSYs come in pairs.

For maximal SUSY the curvature must vanish identically and the space
must be a torus. The next possibility is to preserve half the spinors and
this is achieved by manifolds of the form $K3 \times T^6$ or orbifolds of them by
freely acting discrete symmetries.

We now jump to the case of 4 SUSYs. To find examples, it is convenient
to consider the decompositions $\mathrm{Spin}(10) \supset \mathrm{Spin}(k) \times \mathrm{Spin}(10 - k)$.
The 16 is then a tensor product of two lower dimensional spinors. For
$k = 2$, the holonomy must be contained in $SU(4) \subset \mathrm{Spin}(8)$ in order to
preserve a spinor, and it then preserves two (four once the complex conjugate
representation is taken into account). The corresponding manifolds are
products of Calabi-Yau fourfolds with two tori, perhaps identified by the
action of a freely acting discrete group. This moduli space is closely related
to that of F-theory compactifications to four dimensions with minimal four
dimensional SUSY. The three spatial dimensions are then compactified on
a torus. For $k = 3$ the holonomy must be in $G_2 \subset \mathrm{Spin}(7)$. The manifolds
are, up to discrete identifications, products of Joyce manifolds and three
tori. For $k = 4$ the holonomy is in $SU(2) \times SU(3)$. The manifolds are
free orbifolds of products of Calabi-Yau threefolds and K3 manifolds. This
moduli space is that of the heterotic string compactified on a three torus and
Calabi-Yau three-fold. The case $k = 5$ does not lead to any more examples
with precisely 4 SUSYs.
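
The counting of preserved spinors in the three cases can be made explicit with standard branching rules. The following is a sketch rather than part of the original argument: the labels $8_s$, $8_c$ and the choice of which SU(2) factor of Spin(4) carries the K3 holonomy are conventions assumed here for illustration.

```latex
% Holonomy singlets inside the 16 of Spin(10).
% Each count doubles once the \overline{16} is included, giving 4 SUSYs.
\begin{gather*}
k=2:\quad 16 \to 8_s^{(+1/2)} \oplus 8_c^{(-1/2)}
   \ \text{under } \mathrm{Spin}(2)\times \mathrm{Spin}(8), \\
\qquad 8_s \to 6 \oplus 1 \oplus 1, \quad 8_c \to 4 \oplus \bar 4
   \ \text{under } SU(4) \ \Rightarrow\ 2 \text{ singlets}; \\
k=3:\quad 16 \to (2, 8) \ \text{under } \mathrm{Spin}(3)\times \mathrm{Spin}(7), \quad
   8 \to 7 \oplus 1 \ \text{under } G_2 \ \Rightarrow\ 2 \text{ singlets}; \\
k=4:\quad 16 \to (2,1,4) \oplus (1,2,\bar 4)
   \ \text{under } SU(2)\times SU(2)\times SU(4), \\
\qquad \bar 4 \to \bar 3 \oplus 1 \ \text{under } SU(3), \
   \text{holonomy } SU(2) = \text{first factor}
   \ \Rightarrow\ (1,2,1):\ 2 \text{ singlets}.
\end{gather*}
```

In each case the 16 contains two invariant spinors, and the conjugate representation supplies two more.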

It is possible that M-theory contains U-duality transformations which 
map us between these classes. For example, there are at least some ex- 
amples of F-theory compactifications to four dimensional Minkowski space 
which are dual to heterotic compactifications on threefolds. After further 
compactification on three tori we expect to find a map between the k = 2 
and k = 4 moduli spaces.

It is clear that the metric on the full moduli space still has Lorentzian 
signature in the SUGRA approximation. In some of these cases of lower 
SUSY, we expect the metric to be corrected in the quantum theory. How- 
ever, we do not expect these corrections to alter the signature of the metric. 
To see this note that each of the cases we have described has a two torus 
factor. If we decompactify the two torus, we expect a low energy field 
theoretic description as three dimensional gravity coupled to scalar fields 
and we can perform a Weyl transformation so that the coefficient of the 
Einstein action is constant. The scalar fields must have positive kinetic en- 
ergy and the Einstein term must have its conventional sign if the theory is 
to be unitary. Thus, the decompactified moduli space has a positive metric. 
In further compactifying on the two torus, the only new moduli are those
contained in gravity, and the metric on the full moduli space has Lorentzian 
signature. 

Note that as in the case of maximal SUSY, the region of the moduli space 
with large ten volume and all other moduli held fixed, is in the future light 
cone of any finite point in the moduli space. Thus we suspect that much of 
the general structure that we uncovered in the toroidal moduli space, will 
survive in these less supersymmetric settings. 
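
To recall why the volume direction is the timelike one, here is a minimal sketch for the simplest toroidal (Kasner-like) case. The symbols $p_i$, $\sigma$ and $d$, the diagonal metric ansatz, and the dropped positive normalization are assumptions introduced for illustration only.

```latex
% Minisuperspace kinetic form for ds^2 = -dt^2 + \sum_i e^{2 p_i(t)} dx_i^2
% on a d-torus, in the sign convention where shape modes have positive norm.
\begin{gather*}
\mathcal{G}(\dot p,\dot p) \;\propto\; \sum_{i=1}^{d} \dot p_i^{\,2}
      - \Big(\sum_{i=1}^{d} \dot p_i\Big)^{2}, \\
\dot p_i = \dot\sigma \ \text{for all } i:\quad
      \mathcal{G} \propto d\,\dot\sigma^2 - d^2\,\dot\sigma^2
      = -d(d-1)\,\dot\sigma^2 < 0, \\
\textstyle\sum_i \dot p_i = 0:\quad
      \mathcal{G} \propto \sum_i \dot p_i^{\,2} > 0 .
\end{gather*}
% One negative direction (overall volume) and d-1 positive ones:
% the quadratic form has Lorentzian signature.
```

Growth of the overall volume at fixed shape is therefore a timelike motion, which is the sense in which the large volume region lies in the future light cone.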

The most serious obstacle to this generalization appears in the case of 4 
(or fewer) supercharges. In that case, general arguments do not forbid the 
appearance of a potential in the Lagrangian for the moduli. Furthermore, 
at generic points in the moduli space one would expect the energy density 
associated with that potential to be of order the fundamental scales in the 
theory. In such a situation, it is difficult to justify the Born-Oppenheimer 
separation between moduli and high energy degrees of freedom. Typical 
motions of the moduli on their potential have frequencies of the same order 
as those of the ultraviolet degrees of freedom. In Section 7 we will try to 
present a solution to this conundrum. 

6.8 Chaotically avoiding SUSY 

The considerations of this section also allow us to achieve some insight 
into the problem of why M-theory has not chosen to sit in one of its stable 
highly supersymmetric vacua in the world we observe. The discussion which 
follows is completely rigorous on the branches of moduli space with 16 or 
more SUSYs. It is probably valid for 8 SUSYs as well, for in that case the 
moduli space exists although its topology and metric are not determined 
by classical considerations. Nonetheless, all known extreme regions of the 
moduli space have the properties we will use below. 

The key point is that our analysis of extreme regions of moduli space 
showed a monotonic flow from the unsafe to the safe regions. We have 
neglected extreme regimes corresponding to partial decompactification, and
also the motion of the other moduli, and of the non modular degrees of 
freedom which surely dominate the energy density in regimes where the 
universe has expanded a lot. In fact, inclusion of these other degrees of 
freedom reinforces the conclusion that the universe will always end up in 
the safe domain. 

Horne and Moore [63] have shown that motion on the full moduli space
(as opposed to its Kasner subspace) is chaotic. Furthermore, the subspace
of moduli with unit spatial volume, on which the metric is Euclidean, has
finite volume in the metric on moduli space, which means that the extreme
regions of this space (which correspond to partial decompactifications) have
vanishingly small measure. The chaotic nature of the motion, as well as the
fact that the moduli are, at least at late times, coupled to a stochastic
radiation bath, imply that the generic cosmological solution will in fact
sample regions of the moduli space in proportion to the measure defined by
the kinetic energy of the moduli. In particular, partial decompactifications,
which are of measure zero on the moduli space, will not be generic final
states of the cosmological evolution.

We conclude that the generic cosmological solution in these supersym- 
metric regions of the moduli space will asymptote to a ten or eleven dimen- 
sional universe filled with radiation. All of low energy physics is weakly 
coupled, there are no finite energy scales apart from the Planck or string 
scales, and there are no apparent candidates for long lived nonrelativistic 
particles^®. It seems safe to conclude that none of these model universes 
could ever contain galaxies. Thus, if we are willing to entertain the very 
weak form of the anthropic principle which claims that galaxies are neces- 
sary for intelligent life, we can find an explanation of why we do not live in 
a universe with 8 or more SUSYs. 

I do not claim to find this a completely satisfactory resolution of the 
question. On the one hand, I maintain that this sort of use of anthropic 
reasoning is scientifically valid. That is, we appear, in M-theory, to be 
faced with a model of physics which predicts the possibility of alternate 
universes which do not resemble what we observe. I have tried to give an 
honest account of what happens to a generic universe of this sort (within 
the class with maximal SUSY) and found that it lacks what would appear 
to be a very weak requirement for the existence of life. I did not have 
to speculate about unknown results in extra universal biology to come to 
this conclusion. On the other hand, one might wish for a sharper distinction 
between our own universe and these unobservable ones. Wouldn’t it be nicer 
if they all suffered some sort of satisfyingly final cosmic catastrophe and sank 
back into the ultraviolet muck of creation?^^ Or perhaps one could, with a 
more comprehensive knowledge of M-theory, argue that generic cosmological 
solutions of the whole theory do not end up in the maximally SUSY regions. 

One direction in which to search for such an argument has to do with in- 
flation. I have purposely avoided mentioning that cosmologies which remain 
on the moduli spaces with 8 or more SUSYs cannot inflate. The obvious 
retort to such a remark is that inflation could have occurred somewhere 
else in configuration space, and the system could then have rolled down to 
the moduli space. One cannot investigate the probability of such a motion 
without a much more thorough understanding of M-theory than we now 
possess. So the galactothropic explanation of the absence of SUSY ground 



We do not see any source for a population of large and therefore long lived black holes.
^^We will see something of the sort happening to another class of undesirable universes 
in the next section. 
states is the best we can do at the moment. Perhaps it will be the best we
can ever do. 

6.9 Against inflation 

To an audience of astroparticle physicists the suggestion that inflation might 
not be a necessary feature of our explanation of the universe is akin to heresy. 
I therefore thought it would be amusing to insert some speculations here 
about alternative ways to solve the cosmological conundra which led to the 
invention of inflation. Those of my readers who actually attended these 
lectures will note that I did not actually present this material. Let me assure
you that it was only for lack of time, and not because I was afraid of being 
mauled by an angry crowd of true believers. 

To begin our trek down the path of heterodoxy let me attack the common 
wisdom about the horizon problem. This is the observation that in conven- 
tional Big Bang cosmology, the horizon at early times is much smaller than 
the backward extrapolation of our current horizon. Thus, regions of the 
universe that we can observe today were out of causal contact. How, one
then asks, can their contents be in thermal equilibrium at a uniform temperature?
I would like to contend that the M theorist’s answer might be
“very easily”. Local field theory is only an approximation to M-theory. At
sufficiently high energies it is clear that locality breaks down in some way. 
The typical high energy state in perturbative string theory is an extremely 
long single string. Beyond the perturbative approximation, large branes of 
other dimensions may be relevant. Although brane interactions are local 
on the brane (e.g. strings split and join at points in spacetime) this does
not seem to be an argument which forces one to conclude that the correct 
state of the string is unlikely to be a typical member of the ensemble of 
strings with given energy (as one argues for a quantum field theory in a 
Big Bang cosmology when one says that the fields in causally disconnected 
regions have not had a chance to thermalize). If the system is in thermal 
equilibrium at very high energy, and if the expansion is slow enough, then 
it will remain at equilibrium at lower energies. 

Another argument against naive locality at the fundamental scale (which 
might be much lower than $10^{19}$ GeV) has to do with black holes. Once the
typical energy and impact parameter in particle collisions are such that 
black hole formation is common, the spacetime geometry is distorted in a 
way which modifies the naive causality arguments. If we believe that black 
hole evaporation is a unitary process, then standard causality arguments are 
only valid outside black hole horizons (I am assuming that if the universe is 
closed, then its radius is much larger than the relevant black hole horizons).
All states associated with a given black hole are in thermal equilibrium with 
each other, and black holes will tend to coalesce, bringing more and more 
of the system into equilibrium. 
The claim then is that the horizon problem is not a problem (I am
being deliberately provocative here - I don’t know whether I believe these
arguments). Rather, the principle of thermodynamic equilibrium, i.e. that
systems tend to be in typical states consistent with their energy content, is
more fundamental than the causality principle applied to a simple averaged 
classical geometry and a model of its matter content as localized particles 
interacting via local field theory. 

Similar remarks apply to the monopole problem, at least in those re- 
gions of moduli space where there is no grand unified group below the 
fundamental scale. Monopoles then belong to the high energy theory, and 
the conventional field theoretic estimates (again based on causality) of their 
abundance are incorrect. 

One can make an even more convincing attack on the arguments for 
the flatness problem. This puzzle is based on the model of a homogeneous 
isotropic universe. This should properly be regarded as a phenomenologi- 
cal model rather than a fundamental starting point for cosmology. Indeed 
although descriptions of inflationary cosmology usually start from standard 
Robertson-Walker ideology, they in fact reject that ideology. Homogeneity
and isotropy arise as late time fixed point behavior. However, if one is
going to start from more general initial conditions, one can get rid of the 
flatness problem in a simpler way. Indeed, I have argued above that a more 
fundamentally motivated approach to cosmology might start from geome- 
tries (and configurations of other fields) on a moduli space of static classical 
solutions of the SUGRA equations. It is a generic feature of such models, 
that unless the energy density is allowed to be negative, the universe evolves 
monotonically toward large volume. Thus spatial curvatures (and many of 
the Calabi-Yau manifolds on these moduli spaces are curved) are generally 
evolving towards zero without any fine tuning. The general cosmological 
solution for motion on a moduli space of geometries, coupled to positive 
energy density matter evolves toward zero spatial curvature if we are only 
willing to wait long enough. The only real issue left among the conventional 
cosmological puzzles is the Entropy Problem. 

To explain this in more detail let us consider a simple example of the 
kind of model we are discussing. Consider the moduli space of solutions of 
weakly coupled heterotic string theory compactified on a three torus large
compared to the string scale, times a Calabi-Yau threefold. Let us agree 
to ignore the phenomenological problems with the dilaton which make this 
regime problematic as a model of the real world. The Friedmann equation 
for this model has the form 

$$m_P^2 \left(\frac{\dot a}{a}\right)^2 = m_P^4 \left[\frac{b}{a^6} + \frac{d}{a^4} + \frac{e}{a^3} + \Lambda\right]. \tag{6.29}$$

Here $a$ is the scale factor of the three torus, and $b$, $d$, $e$ and $\Lambda$ represent the
contributions to the energy density of the moduli, radiation, nonrelativistic
matter, and a cosmological constant, all measured in Planck units. We
choose conventions such that $a = 1$ is the present scale factor. The volume
of the torus is $a^3 V_0$, where $V_0$ is the volume today. Observation tells us
that the periods of the three torus are of the same order as, or larger than
our horizon volume, whose size is $10^{60}$ Planck units. We neglect processes
which convert one form of energy density into another and do not attempt to
explain why all of the constants $d$, $e$ and $\Lambda$ are within an order of magnitude
or so of each other.
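
The different powers of the scale factor in the Friedmann equation fix both the expansion law in each era and the order in which the components dominate. The following is a rough sketch, assuming units with $m_P = 1$, dropping order-one factors, and writing $\Lambda$ for the cosmological constant term; the generic exponent $n$ and coefficient $c_n$ are notation introduced here.

```latex
% Single-component domination: H^2 ~ c_n / a^n  =>  a^{n/2 - 1} da ~ dt.
\begin{gather*}
\Big(\frac{\dot a}{a}\Big)^2 \simeq \frac{c_n}{a^{n}}
   \quad\Longrightarrow\quad a(t) \propto t^{2/n} \quad (n > 0), \\
\text{moduli } (n=6):\ a \propto t^{1/3}, \qquad
\text{radiation } (n=4):\ a \propto t^{1/2}, \qquad
\text{matter } (n=3):\ a \propto t^{2/3}, \\
\text{constant term } (n=0):\quad a \propto e^{\sqrt{\Lambda}\,t}, \\
\frac{b}{a^6} = \frac{d}{a^4} \ \text{at}\ a = \sqrt{b/d}, \qquad
\frac{d}{a^4} = \frac{e}{a^3} \ \text{at}\ a = \frac{d}{e}, \qquad
\frac{e}{a^3} = \Lambda \ \text{at}\ a = \Big(\frac{e}{\Lambda}\Big)^{1/3}.
\end{gather*}
```

The moduli kinetic energy thus redshifts fastest and matters only at the earliest times, in line with the statement that motion on moduli space stops early.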

The moduli of the torus are the ratios of its periods, the angles between 
the different toroidal directions, Wilson lines for the heterotic gauge fields 
and “Wilson two surfaces” for the antisymmetric tensor potential of het- 
erotic string theory. These evolve as a nonlinear sigma model of Goldstone 
type. The analysis of [25] implies that motion on this space stops early in 
the history of the universe, its kinetic energy being converted into a gas 
of momentum modes of the corresponding fields, which contributes to the 
constant d. The torus then expands indefinitely with fixed shape. Thus, 
if we wait long enough, all remnant of the finiteness of space is wiped out, 
without fine tuning of initial conditions. In cases where the moduli space in 
question is a family of curved Calabi-Yau spaces, the same analysis applies 
and the spatial curvature is erased without any fine tuning. 

The real difficulty for this solution of the flatness problem is simply that 
if we wait long enough for spatial finiteness and curvature to be stretched 
away, there may not be enough matter and radiation in our model to ac- 
count for the universe we observe. This is what is commonly referred to 
as the Entropy problem in the literature of inflationary cosmology. The 
models we are discussing show that it is logically separate from the flatness 
problem, which is rather specific to homogeneous isotropic models, where 
the spatial geometry at each instant is not a static solution of the Einstein 
equations. In these models, generic initial values for curvature would have 
long ago led to a curvature dominated regime of expansion and substantially 
modified much of cosmic history. In models based on moduli, generic initial 
conditions would not have changed the expansion rate very much at late 
times and would probably not show up in local physics. Their discrepancy 
with observation would simply come from the absence of evidence for global 
structure or anisotropy in the background geometry. 

Another way to phrase the Entropy Problem is the discrepancy between 
the universe’s energy content and its size at the “moment of the Big Bang”.
If one follows the conventional Robertson-Walker cosmology back to the
Planck energy density, then the linear size of our horizon volume at that 
time is $10^{30}$ Planck units. The size of any closed universe would have to be
larger than this. I do not have any explanation of this large pure number in 
the present context. In inflationary cosmology it is solved by creating the 
matter and radiation after a period of inflationary expansion. 
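
The size of this number can be checked with a back-of-the-envelope estimate. The sketch below assumes adiabatic expansion with $a \propto 1/T$ (ignoring changes in the number of relativistic species), a present temperature $T_0 \approx 2.7$ K $\approx 2.3 \times 10^{-13}$ GeV, a Planck energy of about $1.2 \times 10^{19}$ GeV, and a present horizon size $L_0$ between the $10^{60}$ Planck units quoted earlier and roughly $3 \times 10^{61}$; none of these inputs is taken from the original argument beyond the first value of $L_0$.

```latex
\begin{gather*}
\frac{a_{\rm Pl}}{a_0} \sim \frac{T_0}{T_{\rm Pl}}
   \sim \frac{2.3\times 10^{-13}\ \mathrm{GeV}}{1.2\times 10^{19}\ \mathrm{GeV}}
   \sim 2\times 10^{-32}, \\
L_{\rm Pl} \sim L_0\,\frac{a_{\rm Pl}}{a_0}
   \sim \big(10^{60}\ \text{to}\ 3\times 10^{61}\big)\times 2\times 10^{-32}\ \ell_P
   \sim 2\times 10^{28}\ \text{to}\ 6\times 10^{29}\ \ell_P .
\end{gather*}
```

Either way the result is a pure number some thirty orders of magnitude above unity, which is the discrepancy the Entropy Problem refers to.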
At the level of the semiclassical analysis we have done, there does not
seem to be any strong objection to such initial conditions. We have empha- 
sized that the semiclassical treatment of the moduli requires only that the 
volume of the universe be large. At energies above the Planck scale, there 
will be new terms in the equations of motion of the moduli representing 
their interaction with the full set of high energy degrees of freedom of M- 
theory. But in principle one could imagine following the evolution back to 
a Planck size for the whole universe before the semiclassical approximation 
breaks down. The statement of the Entropy Problem at that time would 
be that the energy density was many orders of magnitude higher than the 
Planck scale. Is there some principle which prevents this? 

It would be nice to find one, because one would like to have a clean reason 
for rejecting alternatives to inflation. Alternatively, it would be interesting 
to find an explanation of this large number, and to take the anti-inflationary 
cosmology more seriously. In the latter event one would be required to 
come up with an explanation for the fluctuations in the cosmic microwave 
background at least as convincing as that provided by inflationary models. 

What should the serious cosmologist take away from this discussion? 
I hardly hope or wish to convince anyone to abandon the inflationary 
paradigm. However, I think it is salutary to recognize that many of the 
theoretical arguments which one thinks of as the basic raison d’être of inflationary
cosmology, are on rather shaky ground in the light of current theory.
The clear cut triumphs of inflation are reduced to two: the explanations of 
the entropy of the current universe and of the fluctuations in the microwave 
background. 

In the next section, we will abandon this heresy and pursue a more 
orthodox path. 



6.10 Conclusions 

We argued that the supersymmetric moduli of M-theory were the natural 
semiclassical variables which provide the clock for cosmology. Our argument
was based on the naive Wheeler-DeWitt quantization of gravity but we
presented some evidence that the general structures assumed in that quan- 
tization were more robust than their derivation from a low energy effective 
theory would have led us to believe. We showed that duality transforma- 
tions resolve some but not all cosmological singularities, and provided a 
first draft of an argument for the absence of highly supersymmetric vacuum 
states of M-theory in the list of Natural Phenomena in the Real World. We 
also briefly explored a heterodox, noninflationary, approach to cosmology 
which resolves some but not all of the problems that inflation was invented 
to solve. 

7 Moduli and inflation

7.1 Introduction 

In this lecture we will finally start to discuss more realistic sectors of 
M theoretic cosmology. As I have warned you several times, this area is 
still under development and there is no justification for trying to build de- 
tailed models which can be compared to observation. Indeed, towards the 
end of my presentation I will describe my own favorite scenario for cosmol- 
ogy in M-theory. It turns out that its viability depends heavily on numerical 
factors of order one which cannot be reliably calculated at present. Such fac- 
tors in fundamental quantities have a tendency to get raised to high powers 
in a cosmological context (e.g. the widths of unstable states depend on the
cube of their masses and the square of their couplings. These in turn might
be estimated by formulae which depend on high powers of some fundamental
scale. Mistakes of order one can thus be amplified). Also, experience
with weakly coupled string theory shows that order of magnitude estimates
can miss factors like $16\pi^2$. Our fundamental contention about M-theory is
that neither the true vacuum state nor the point where inflation takes place 
are likely to sit in one of the weakly coupled or large radius regimes where 
systematic calculations can be done. Thus, we are unlikely to be able to 
extract detailed numbers from M-theory until we learn a lot more about the 
nonperturbative formulation of the theory. In this situation it seems wisest 
to try to investigate very general problems, and that is what we will try to 
do. I will deviate from this formula only towards the end of my lectures, in 
order to present the amusing scenario that I favor. 



7.2 Moduli as inflatons? 

In view of our discussion in the previous section, one might have thought 
that the appropriate title for this section was “Cosmology on the Moduli 
space with 4 SUSYs”. At first sight, the phrase in quotes does not appear 
to make any sense. M-theory has no global internal symmetries - all of its 
symmetries are residual gauge symmetries which leave some class of config- 
urations invariant^®. With only 4 SUSYs, supersymmetry alone permits a 



As usual, there are two arguments for this, one based on SUGRA, the other on
perturbative string theory. Their agreement is taken as evidence that the statement is 
exact. The SUGRA argument is simply that all symmetries of SUGRA are diffeomor- 
phisms, thus gauge symmetries. Global symmetries arise only as diffeomorphisms which 
leave invariant the asymptotic behavior of the noncompact portion of space. All other 
symmetries are gauged. In perturbative string theory an internal symmetry would arise 
as a symmetry of the superconformal field theory describing the internal space. One 
can show [40], that a continuous global symmetry implies the existence of a Kac-Moody 
current algebra in the superconformal field theory (basically just Noether’s theorem plus 
superpotential on the space of chiral superfields. The full effective potential
is the sum of a term coming from the so-called D-terms of continuous
gauge groups, and a term coming from the superpotential^®. The D-terms 
are positive, and the moduli space of fields on which they vanish can be 
parametrized in terms of gauge invariant composite fields. The superpoten- 
tial can be viewed as a function on this space. The only symmetries which 
act on the composites are discrete gauge symmetries'^. In most cases, a 
discrete symmetry cannot imply the vanishing of a function on an entire 
submanifold (we will explore the exception below). 

The apparent implication of this is that the phrase “moduli space of
M-theory compactifications with 4 SUSYs” has no meaning. There
is no moduli space in the true sense of the word (with the exception noted 
in the last parenthesis). Nonetheless, the authors of [24] proposed and [25] 
and others explored the idea, that moduli of such compactifications were 
the natural inflaton candidates in string/M-theory. Note that inflatons, by 
their nature, must have a potential so the idea of moduli as inflatons is truly 
oxymoronic. 

However, I hope to demonstrate for you that this idea is not at all 
idiotic, and that it has many attractive features. The original proposals 
were based on string perturbation theory. Here the idea of a moduli space 
of quadrisusic^^ compactifications makes perfect mathematical sense. At 
string tree level, a vacuum state is characterized as a conformal field theory 
with certain extra properties. There is an exact theorem which guarantees 
the existence of continuous families of solutions to this constraint. The most 
famous among them are those which correspond to compactification of the 
heterotic string on a CY 3-fold with the standard embedding of the spin 
connection of the manifold in the gauge group. Here the theorem follows 
from the fact that the same conformal field theories can be used to com- 
pactify Type II string theories to four dimensions, preserving 8 spacetime 
SUSYs. The extra spacetime SUSY guarantees the existence of moduli. 
The heterotic and Type II theories compactified on these backgrounds dif- 
fer at the one loop level and beyond, and the heterotic theory has only 
4 SUSYs. Nonetheless, to all orders in the loop expansion, no superpo- 
tential is generated on the tree level moduli space in the heterotic theory. 
Indeed, the heterotic coupling, like a generic gauge coupling, can be viewed 
as the real part of a chiral superfield $S = \frac{1}{g^2} + i\theta$ (up to normalization), whose imaginary part is an



conformal invariance — up to technicalities). The Kac-Moody currents can be used to
construct vertex operators for massless gauge bosons. 

^^See Keith Olive’s lectures at this school for a concise introduction to four dimensional 
SUSY, chiral superfields, superpotentials, D terms, etc. 

^^The only difference between gauged and nongauged discrete symmetries from a prac- 
tical point of view is the absence of stable domain walls for gauged discrete symmetries, 
A recently rediscovered ancient Latin word meaning: having four supersymmetries.
axionlike field called the model independent string axion. This field arises by
a duality transformation on a second rank antisymmetric tensor gauge field.
As a consequence, to all orders in perturbation theory there is a continuous
shift symmetry $S \to S + ia$. This symmetry, combined with holomorphy,
forbids any perturbative correction to the superpotential. 
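
The nonrenormalization argument can be spelled out in two lines. This is a sketch of the standard reasoning, not a quotation of the original derivation: $\phi$ stands for the remaining chiral moduli, $w_n$ for a hypothetical $n$-loop coefficient function, and loop counting is assumed to follow powers of $g^2 \propto 1/\mathrm{Re}\,S$.

```latex
\begin{gather*}
W = W(S,\phi)\ \text{holomorphic}, \qquad
W(S + i a,\phi) = W(S,\phi) \quad \forall\, a \in \mathbb{R} \\
\Longrightarrow\quad
\frac{d}{da} W(S + i a,\phi)\Big|_{a=0} = i\,\frac{\partial W}{\partial S} = 0
\quad\Longrightarrow\quad W = W_{\rm tree}(\phi), \\
\text{whereas an $n$-loop term} \ \propto\ S^{-n}\, w_n(\phi)
\ \text{is not invariant under } S \to S + i a .
\end{gather*}
```

A term of the form $e^{-cS}$ is invariant only under the discrete shifts $a \in (2\pi/c)\,\mathbb{Z}$, so it is excluded as long as the continuous symmetry holds and can appear only if nonperturbative effects break that symmetry to a discrete subgroup.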

The idea behind most previous work on the subject was that the real 
world corresponds to a point in moduli space where the perturbative esti- 
mates of the superpotential were correct. The string coupling was supposed 
to correspond more or less to the perturbative gauge couplings we see in 
nature, or to be related to them by simple group theoretical factors. The su- 
perpotential on the perturbative moduli space was then much smaller than 
the fundamental scales of the theory, and it made sense to think about an 
approximate moduli space. 

This set of ideas had a number of related difficulties. The first was the 
Dine-Seiberg problem [55]. These authors made the simple observation that
for most functions, the leading asymptotic formula in some extreme region 
(here the weak coupling region) is monotonic and does not have minima^^. 
There have been two mechanisms proposed for stabilizing M-theory in the 
weak string coupling regime, which go under the names of Kähler stabilization
[48] and racetrack models [4]. Both imply that, although the couplings
are weak, many quantities cannot be calculated in a systematic expansion. 
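
A toy example makes the Dine-Seiberg observation concrete. The exponential form, the constants $A, B, c, c' > 0$, and the variable $s = \mathrm{Re}\,S \sim 1/g^2$ are assumptions chosen for illustration; the two-term potential is written in the spirit of racetrack models, which are properly formulated at the level of the superpotential.

```latex
\begin{gather*}
V(s) \simeq A\, e^{-c s}, \qquad
\frac{dV}{ds} = -c A\, e^{-c s} \neq 0 \quad \text{for all finite } s, \\
\text{so } V \text{ falls monotonically to } 0 \text{ as } s \to \infty
\ (\text{the free theory}); \\
V(s) \simeq A\, e^{-c s} - B\, e^{-c' s}:\qquad
\frac{dV}{ds} = 0 \iff c A\, e^{-c s} = c' B\, e^{-c' s}.
\end{gather*}
```

A stationary point therefore requires two contributions of comparable size, which is why a minimum cannot be found from the leading asymptotic term alone.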

A related cosmological problem with the weak coupling regime was 
pointed out by Brustein and Steinhardt [54]. There is a distinct possi- 
bility that the universe would “overshoot” a weak coupling minimum and 
evolve into the regime of extreme weak coupling where M-theory is in violent 
disagreement with observation. 

When combined with Witten’s analysis [44] of the possible resolution of 
the discrepancy in the weak coupling prediction of the ratio between the 
unification and Planck scales, these observations compel one to consider 
the possibility that weakly coupled string theory is not a good description 
of nature. A somewhat better starting point is the 11D SUGRA analysis
begun in [43]. The analyses of [44] and [45] indicate that:

• in the regime of moduli determined by the fit to the unified coupling 
strength and the four dimensional Planck mass, the volume of the 
Calabi-Yau manifold on the brane where the standard model lives is 
not really in the regime where the SUGRA expansion can be trusted. 
However, the small size of the four dimensional effective coupling, com- 
bined with holomorphy, is enough to guarantee the usual tree level 
unification relations between standard model couplings. This gives 



Exceptions to this are somewhat pathological. The leading asymptotic behavior 
could contain a factor $\sin(1/g^2)$ which has an infinite number of more and more closely
spaced minima as one approaches the weak coupling regime. 
rise to a situation similar to that hypothesized in the Kähler stabilization
mechanism, where holomorphic quantities can be calculated
reliably but the Kähler potentials of chiral fields are unknown;

• Witten’s hypothesis that the coupling of the gauge fields on the second 
brane is strong, and gives rise to a gaugino condensate whose magni- 
tude is of order the unification scale (which is also the fundamental 
11D Planck scale) induces too high a scale of SUSY breaking on the
standard model brane. We will discuss a resolution of this problem 
below; 

• in the analysis of [45] the SUSY breaking F term comes from the 
modulus which parametrizes the radius of the single large dimension 
transverse to the Hořava-Witten ninebranes. To leading order in the
SUGRA expansion this leads to no-scale SUSY breaking with vanish- 
ing cosmological constant, and also gives rise to degenerate squarks^^; 

• the radial mode is not stabilized in this approximation and we have a 
sort of Dine-Seiberg problem within the SUGRA approximation. It is 
unclear how many of the good features of the model will survive the 
resolution of this problem. It is clear that the vanishing cosmological 
constant will not. 

In short, this scenario is better than perturbative string theory, but not 
without its own flaws. On the other hand, the observation that gauge 
theories arise on branes of finite codimension is generic in M-theory and 
leads one to expect that Witten’s explanation of the Planck and unification 
scales is a correct one. 

At first sight, the above conclusions would seem to rule out the idea 
of modular inflation. If we are in the strong coupling regime and there is 
no reason for the superpotential to be small then what is our excuse for 
separating the moduli out from all the other variables of M-theory? What 
does the word moduli mean in the strong coupling regime with only four 
SUSYs? Worse, one of the points of [25] was that within the context of 
modular inflation, the energy scale during inflation is predicted to be near 
the unification scale. In Witten’s scenario, this scale is identified with the 
fundamental scale of quantum gravity and it seems unreasonable to use any 
sort of effective field theory description to describe this situation. 



®®The degeneracy in mass of the squarks is a desirable phenomenological feature. To 
the extent that it is valid it eliminates unwanted flavor changing neutral currents which 
threaten the viability of generic SUSY models. This success of the scenario is mitigated by 
the failure to stabilize the radial mode. The terms necessary to stabilize the radius come 
from corrections to its Kähler potential. Similar corrections could ruin the degeneracy of
squarks. 

In fact, I claim that the Hořava-Witten scenario and Witten’s use of 
it to explain the ratio between $m_P$ (the four dimensional Planck scale 
$\sim 2 \times 10^{18}$ GeV) and $M$ (the unification scale $\sim 2 \times 10^{16}$ GeV) may 
resolve all of these problems. The key is that the higher dimensional theory 
has more SUSY than the effective theory below the KK scale. The higher 
dimensional SUSY is broken by the branes, but if the bulk volume is large 
then this breaking can be ignored for some purposes. In particular, we can 
identify the moduli space as that of the higher dimensional theory. Thus, 
in such scenarios, a clearcut notion of approximate moduli survives at all 
energy scales, as long as we remain in a regime where the compact volume is 
large. We will call these approximate moduli the inflamoduli to distinguish 
them from certain fields we will discuss below, which get their potential 
only from lower energy physics. 

Note that this is all compatible with the existence of a superpotential of 
order $M^3$ for the inflamoduli, and indeed this order of magnitude is 
reasonable for fields which parametrize properties of the bulk higher dimensional 
theory only if there is enhanced SUSY in the bulk. Otherwise we would have 
expected the effective superpotential of the moduli to contain a factor of 
the volume of the internal space. On the other hand, if the superpotential 
comes only from the vicinity of the branes, it has, by dimensional analysis, 
the form 

$$W = M^3 w(\theta^a) \qquad (7.1)$$

where $\theta^a$ are dimensionless parameters characterizing the internal geometry. 
On the other hand, the kinetic term for these zero modes, just like the 
Einstein term for the zero modes of the gravitational field, is proportional 
to the volume $V_7$ of the internal manifold, and has the form 

$$M^9 V_7\, G_{ab}(\theta)\, \nabla\theta^a \nabla\theta^b. \qquad (7.2)$$

Note that $M^9 V_7 = m_P^2$ is, as the notation indicates, the same 
coefficient which multiplies the Einstein action. Furthermore, although the 
volume $V_7$ is itself a modulus, when we pass to the Einstein conformal frame 
in which $V_7$ is replaced by its vacuum value, the kinetic term of the moduli 
is rescaled in precisely the same manner as the gravitational action. It is 
then natural to define canonical scalar fields by $\phi^a = m_P \theta^a$. Their action 
has the form 

$$\int d^4x \sqrt{-g} \left[ G_{ab}(\phi/m_P)\, \nabla\phi^a \nabla\phi^b - \frac{M^6}{m_P^2}\, v(\phi/m_P) \right]. \qquad (7.3)$$

Now let us examine the implications of a Lagrangian of this form for 
inflationary cosmology. The slow roll equations of motion derived from this 
action are 

$$3H\, \frac{d\phi^a}{dt} = -\frac{M^6}{m_P^2}\, G^{ab}\, \frac{\partial v}{\partial \phi^b} \qquad (7.4)$$

and lead to the equation 

$$\frac{dv}{dt} = -\frac{H}{3v}\, \partial_a v\, G^{ab}\, \partial_b v \qquad (7.5)$$

where $\partial_a$ refers to the derivative with respect to the dimensionless variable 
$\theta^a$. We have also used the slow roll expression for H in terms of the potential. 
From (7.5) we immediately derive an expression for the number of e-foldings 

$$N_e = -3 \int \frac{v}{\partial_a v\, G^{ab}\, \partial_b v}\; \partial_c v\, d\theta^c \qquad (7.6)$$

where the integral is over the trajectory in moduli space that the system 
follows during the time interval when the slow roll approximation is valid. 
We see that in order to obtain a large number of e-foldings we need a 
potential which is flat in the sense that $|\partial v|/v \sim 1/N_e$. The phenomenologically 
necessary $N_e \sim 60$ can be achieved with only a mild fine tuning of 
dimensionless coefficients. Correspondingly, the conditions on the potential which 
ensure the validity of the slow roll approximation are order one conditions 
on the derivatives of the potential and do not contain any exponentially 
small dimensionless numbers. 
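
As a rough illustration of the flatness criterion, the sketch below evaluates the e-folding integral numerically for a single modulus with trivial metric ($G = 1$), in the form $N_e = 3 \int (v/|\partial v|)\, d\theta$. The exponential toy potential, the slope `LAM`, and the function names are assumptions made for this example only; nothing here is taken from a specific string compactification.

```python
import math

def efoldings(v, dv, theta_start, theta_end, steps=10_000):
    """Trapezoid estimate of N_e = 3 * integral of v / |dv/dtheta| d(theta).

    Single-field toy version of the e-folding integral with G = 1.
    """
    h = (theta_end - theta_start) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = theta_start + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * v(theta) / abs(dv(theta))
    return 3.0 * abs(total * h)

LAM = 0.05  # hypothetical logarithmic slope |dv/dtheta| / v of the toy potential

def toy_v(theta):
    """Toy potential v = exp(-LAM * theta), chosen only for illustration."""
    return math.exp(-LAM * theta)

def toy_dv(theta):
    """Derivative of the toy potential with respect to theta."""
    return -LAM * math.exp(-LAM * theta)

# Order-one excursion in the dimensionless field theta.
n_e = efoldings(toy_v, toy_dv, 0.0, 1.0)
print(f"N_e = {n_e:.1f}")
```

With a logarithmic slope of 0.05, an order-one excursion in $\theta$ already yields about 60 e-foldings, which is the sense in which only a mild tuning of dimensionless coefficients is needed.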

An additional feature of modular dynamics, which provides extra 
frictional damping of the motion of the moduli, was discovered in [25]. If 
we completely ignore the potential on moduli space, it is still an interacting 
nonlinear system. In [25] the equations for small fluctuations of the modular 
field theory around a solution of the equations of motion (without potential) 
for the zero modes were studied, and an unstable mode was found. This 
was interpreted as an efficient mechanism for converting kinetic energy of 
the zero modes into energy of a gas of nonzero modes. It was estimated 
that the zero modes were effectively brought to a halt by this mechanism in 
less than a Lyapunov time of the chaotic motion on moduli space. In the 
inflationary context, this mechanism will act as a source of friction which 
should make inflation much more probable. In particular, it is an avenue 
in which the large dimension of the moduli space (which can be a number 
of order $10^2$) could affect inflation, by providing a large number of degrees 
of freedom for efficient frictional damping of the zero mode motion. This is 
a topic which has not been investigated and deserves much more thorough 
study. 

The fact that actions of the form (7.3) give rise to inflation with minimal 
fine tuning, and that such actions naturally arise for moduli in string theory 
was pointed out in [25]. The general point that moduli might provide the 
flat potentialled, weakly coupled fields necessary to inflation was first made 
in [24]. Here we note that in brane scenarios, it is the bulk inflamoduli which 
play this role. There may also be moduli associated with branes, but they 
will have a natural scale M and have a quite different role to play. 

Another pleasant surprise awaits us when we plug the potential from 
(7.3) into the standard formula for the amplitude of the primordial energy 
density fluctuations generated by inflation. Up to numbers of order one we 
find 

$$\frac{\delta\rho}{\rho} \sim N_e\, (M/m_P)^3 \sim 10^{-5} \qquad (7.7)$$

where the numerical value comes from the measured cosmic microwave 
background fluctuations, and $N_e \sim 50$. This gives $M \sim (2/10)^{1/3} \times 2 \times 10^{16}$ GeV, 
which, given the crudeness of the calculation, is the unification scale. To put 
this in the most dramatic manner possible, we can say that a brane scenario 
of the Hořava-Witten type, given the unification scale as input, predicts the 
correct amplitude for inflationary density fluctuations. Furthermore, the 
whole scenario only makes sense because of the same large volume factor 
that underlies Witten’s explanation of the ratio between the Planck and 
unification scales. This is necessary at a conceptual level to understand 
why it is sensible to think about a modulus with a superpotential of order 
the fundamental scale, and at a phenomenological level to understand the 
magnitude of the density fluctuations. 
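
The arithmetic behind this estimate can be checked in a few lines. The sketch below inverts the order-of-magnitude relation $\delta\rho/\rho \sim N_e (M/m_P)^3$ for $M$, using the inputs quoted in the text ($\delta\rho/\rho \sim 10^{-5}$, $N_e \sim 50$, $m_P \sim 2 \times 10^{18}$ GeV). The function name is hypothetical, and all factors of order one are dropped, as in the text.

```python
def unification_scale_gev(delta_rho_over_rho=1e-5, n_e=50.0, m_planck_gev=2e18):
    """Solve delta_rho/rho ~ N_e (M/m_P)^3 for M, dropping order-one factors."""
    return (delta_rho_over_rho / n_e) ** (1.0 / 3.0) * m_planck_gev

M = unification_scale_gev()
print(f"M ~ {M:.2e} GeV")
```

The result is slightly above $10^{16}$ GeV, within a factor of two of the unification scale used as input elsewhere in the argument.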

A detailed calculation of the fluctuation spectrum as opposed to its ab- 
solute normalization requires more knowledge of the potential v than we 
possess. A crucial question (posed during my lecture by Andre Linde) is 
how natural the phenomenologically necessary flat spectrum is in this con- 
text. I leave it as an exercise for the enterprising student. 

Although it has no connection with our discussion here I cannot resist 
pointing out the other piece of evidence for a scale of the same order as 
M. Any theory of the type we are discussing would be expected to contain 
corrections to the standard model Lagrangian of the form (in superfield 
notation) $\frac{1}{M}(LH)^2$, which gives rise to neutrino masses. It is a matter of 
public record [58] now that such masses exist, with an estimated value for 
M between 0.6 and $1.8 \times 10^{15}$ GeV. Although this is an order of magnitude 
shy of the unification scale I believe the uncertainties in coefficients of order 
one in dimensional analysis could easily make up the difference. If not, we 
will have the interesting problem of explaining the existence of two close 
but not identical energy scales in fundamental physics [61]. 

We also want to note that this scenario for inflation does not suffer from 
the runaway problem pointed out by Brustein and Steinhardt [54]. These 
authors noted that the inflationary vacuum energy is much larger than the 
SUSY breaking scale. Furthermore, the minimum of the effective potential 
was assumed close to the region of weak string coupling. There was then a 
distinct possibility that the inflaton field would overshoot the small barrier 
separating it from the extreme weak coupling regime where string theory 
is incompatible with experiment. In the present scenario, the coupling is 
not assumed to be weak (nor the volume extremely large). Furthermore the 
inflationary potential has nothing to do with SUSY breaking. There is no 
runaway problem at all. 

The authors of the papers in [25] agonized over the discrepancy between 
the unification scale and the scale of SUSY breaking. In fact, they discussed 
and discarded what I now believe is the obvious solution of this problem, 
because of problems specific to weakly coupled string theory^^. The obvious 
way to avoid SUSY breaking at the scale M, is to insist that the superpo- 
tential (7.1) has a SUSY minimum. In fact, the existence of such minima is 
generic, requiring only the solution of n complex equations for n unknowns. 
However, in general, the superpotential will not vanish at such a minimum 
but instead will give rise to a negative cosmological constant. 

It was pointed out in [26] that in postinflationary cosmology, the 
universe’s attempt to access such a SUSY minimum of the effective potential 
leads to a very welcome cosmological disaster. The key point is that inflation 
has completely eliminated the spatial curvature terms from the cosmological 
equations, so that the Friedmann equation reads 

$$m_P^2 (\dot a/a)^2 = G_{AB}\, \dot\phi^A \dot\phi^B + V. \qquad (7.8)$$

This does not have static solutions with $\phi^A$ resting at a minimum of V 
with negative value. What happens instead is that a generic solution of 
the cosmological equations^® reaches a point where $\dot a = 0$ and then begins 
to recollapse to infinite energy density. This happens on a microscopic 
time scale. Thus inflationary cosmology eliminates generic SUSY preserving 
minima of the effective potential from the list of late time attractors of the 
cosmological equations. 

The stable postinflationary attractors of a supersymmetric cosmology 
are points in inflamoduli space with vanishing superpotential and SUSY 
order parameters. These can be characterized in terms of a symmetry. 
Namely, any complex R symmetry forces the superpotential to vanish, and 
if there are no fields of R charge 2 then the SUSY order parameter vanishes 
as well. The R symmetry must of course be discrete, since we are discussing 
M-theory^®. If in addition, there do exist fields of R charge 0, then there will 
be an entire submanifold on which the superpotential vanishes and SUSY is 
preserved. Our future considerations will concentrate on this submanifold, 
which from now on we call the true moduli space, since it is the oft advertised 



^^Namely the fact that superpotentials are exponentials of exponentials of the canoni- 
cally normalized dilaton field. 

There are very special solutions in which the universe is static and the scalar fields 
oscillate in the potential with exactly zero energy, and I once thought that these were 
relevant to the cosmological constant problem. However, they are unstable to small 
perturbations. 

®®This is an example of the nonexistence of continuous global symmetries. 

exception to our statement that quadrisusic backgrounds had no moduli 
space. It is the locus of restoration of a discrete R symmetry with the above 
properties. We should expect the true moduli space to have more than one 
connected component, each characterized by a different R symmetry. 



7.3 Radius stabilization 

Every silver lining has its cloud. The discussion above treated the four 
dimensional Planck scale as a fixed parameter. In fact, in the Hořava-Witten 
scenario, it is determined by the radius of the fifth dimension, which is one 
of the moduli. In fact it is one of the bulk moduli and might be expected 
to vary during inflation. 

At first glance, the situation appears to be much worse than that. In 
the limit of large R, the Lagrangian of the field R is highly constrained by 
extended SUSY. In this limit the Kähler potential of the superfield T which 
contains R is fixed to be $-3 m_P^2 \ln(T + T^*)$. In the analysis of [44,45], the 
superpotential was supposed to be generated only by gaugino condensation on 
the hidden brane, separated by a distance R from the brane where the stan- 
dard model lives. This is a function only of a particular linear combination 
S, where S is the superfield which controls the coupling of the hidden sector 
gauge group. The superpotential can also depend on the other moduli, e.g. 
the complex structure moduli of the Calabi-Yau threefold, as well as the vec- 
tor bundle moduli in the hidden gauge group. Although this superpotential 
is not explicitly calculable, it will generically have a supersymmetric point 
with S fixed to be small (the hidden gauge theory is strongly coupled) and 
the complex structure and hidden sector gauge bundle moduli fixed. Unless 
there are points of enhanced discrete R symmetry, as described above, the 
superpotential will be nonvanishing at the SUSY point and of order $M^3$. 

The fact that the superpotential is of order $M^3$ means that it cannot 
really be considered to have originated in some “low energy effective theory”, 
but comes from physics at the fundamental M-theory scale. The possibility 
of superpotentials generated at short distance was not appreciated in [44] 
and [45], nor as far as I can tell in any of the papers on M-theory phe- 
nomenology which have appeared since that time. I do not see any good 
argument for omitting such terms in the low energy Lagrangian. However, 
there is a symmetry argument that such a superpotential will be of the form 
X)n>o ^n{S, ^ -v\rhere A: is a number of order one. The factors 

in the exponent will be explained below. Here C is a collection of superfields 
representing the complex structure moduli, as well as vector bundle moduli 
for the gauge configurations on each wall^^. The imaginary part of T comes 
from a pure gauge mode of the bulk graviphoton, which is chosen to vanish 
on the hidden sector wall. The gauge symmetry becomes a shift symmetry 
for Im T. One may expect this symmetry to be broken by effects involving 
membrane instantons stretched between the walls, and by fivebranes (which, 
in the walls, are gauge theory instantons). As a consequence, a discrete 
remnant of the shift symmetry remains, and this is what constrains the 
superpotential in the manner described above. 

Thus, in the large R limit, one expects the Kähler potential of the field T 
to be given by its asymptotic form, and the superpotential to be independent 
of T. As a consequence, even if we assume the inflamoduli are slowly rolling 
at some point away from the minimum of their potential, the dynamics of 
the universe will be strongly influenced by the motion of T. It is easy to 
see that the real part of T is, in Einstein frame, related to a canonically 
normalized scalar field with an exponential potential. The slope in the 
exponent is outside the range in which (power law) inflationary solutions of 
the equations of motion exist. Other sources of friction for T must be found 
if inflation is to take place. 
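
The claim about the slope can be made concrete with a short worked computation. The sketch below assumes standard $N = 1$ supergravity normalizations, with $m_P$ the reduced Planck mass, a canonically normalized field $\sigma$ with kinetic term $\frac12(\partial\sigma)^2$, and a potential that depends on T only through the prefactor $e^{K/m_P^2}$; the symbols $\tau$, $\sigma$ and $\lambda$ are introduced here for illustration and do not appear in the text.

```latex
\begin{align}
  K &= -3 m_P^2 \ln(T + T^*), \qquad
  K_{TT^*} = \frac{3 m_P^2}{(T + T^*)^2}, \qquad \tau \equiv \mathrm{Re}\, T, \\
  K_{TT^*}\, \partial T\, \partial T^* &\supset \frac{3 m_P^2}{4 \tau^2} (\partial \tau)^2
    = \tfrac12 (\partial \sigma)^2
    \quad\Longrightarrow\quad \sigma = \sqrt{\tfrac32}\; m_P \ln \tau, \\
  V &\propto e^{K/m_P^2} = (2\tau)^{-3} \propto e^{-\lambda \sigma / m_P},
    \qquad \lambda = \sqrt{6}.
\end{align}
% For V \propto e^{-\lambda\sigma/m_P} the scaling solution is a(t) \propto t^{2/\lambda^2},
% which accelerates only for \lambda^2 < 2; here \lambda^2 = 6.
```

Under these assumptions the exponent is $\lambda^2 = 6$, well outside the window $\lambda^2 < 2$ in which power law inflation occurs.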

There are several obvious sources for such extra friction. The first is 
the imaginary part of T, which, in the large R approximation, behaves 
like a Goldstone field. Unfortunately, this means that the energy density 
associated with this field, and the extra friction associated with it, scales 
away like $1/a^6$. While I have not done a proper numerical study of this 
system it seems unlikely that it will have long periods of inflation for generic 
initial conditions®®. 

Two other sources of extra friction are the excitation of nonconstant 
modes of the T field, and Kaluza-Klein particle production. In [25] it was 
argued that the first of these mechanisms is very efficient at stopping the 
chaotic motion on moduli space with no potential. As noted above, there is 
an instability which converts modular zero mode kinetic energy into a gas of 
nonzero modes within less than a Lyapunov time of the chaotic motion on 
moduli space. It seems quite plausible that in the presence of an exponential 
potential one would then have inflationary solutions. Kaluza-Klein particle 
production is also to be expected in the presence of a rapidly moving T field, 
because the real part of T directly influences the masses of these particles. 

Obviously, more work is needed to see whether these mechanisms can 
really salvage the inflationary scenario of the previous section. Even if they 
do, one mystery still remains. Although some combination of these effects 



®^Here and henceforth we restrict attention to CY threefolds with only a single Kähler 
modulus and disregard the possibility of inserting M5 branes in the bulk between the two 
walls. 

®®Remember that unlike the case of the other moduli fields, there are no unknown 
parameters in the asymptotic Lagrangian for T. 

can explain why T is slowly varying during inflation, there is no explanation 
of why it is close to its vacuum value. Since the four dimensional Planck 
mass (and through it our successful prediction of the magnitude of energy 
density fluctuations) depends exponentially on the canonically normalized 
field constructed from the real part of T, it is extremely important to explain 
this coincidence. 

Another possibility for rescuing inflation comes from the recognition that 
the radial modulus has a Dine-Seiberg instability. That is to say, although 
we would like to be doing a systematic asymptotic expansion in R, we 
know that we will never find a stable minimum for T in this approximation. 
Thus we should admit that near the vacuum value for T, the large radius 
expansion for (at least) the effective potential of this field has broken down. 
Let us recall that we defined T in terms of the deviation of the radius from 
its vacuum value [45]. Thus, $RM \sim (m_P T/M^2)$. On physical grounds, we 
expect corrections to the asymptotic form of the Lagrangian to be functions 
of RM. In the case of the T dependence of the superpotential discussed 
above, this guess can be verified by analytic continuation from the region 
of weakly coupled string theory [45]. 

The potential for T during inflation has two terms. The first, coming 
from the F terms of the other moduli was discussed above, and is all that 
exists in the extreme asymptotic limit. In that limit, it gives an 
exponential potential with slope of order $1/m_P$ for the canonically normalized field 
$\sim m_P \ln \mathrm{Re}(T/m_P)$. The second term has the form: 

$$V \sim e^{K/m_P^2} \left[ K^{TT^*}\, |K_T/m_P|^2 - 3 \right] |W/m_P|^2 \qquad (7.9)$$

where K is the Kähler potential of T. The implication of the previous 
paragraph is that there is a region of $T/m_P$ of order one, where K is very 
different from its asymptotic form, and varying rather rapidly as a function 
of this variable. Now consider initial conditions where RM starts out close 
to one and growing. The T field will then have to cross a regime in which 
the rapidly varying piece of the potential is significant before it can access 
the asymptotic regime. If the other moduli are slowly rolling, it is clear that 
it will instead be rapidly driven very close to the minimum of its potential. 
Unfortunately, I have no argument that this is the same as its VEV. 

Indeed, we will see in the next section that the minimum of the low 
energy potential for T is the same as that of (7.9). There is no obvious 
reason to expect the first term of the potential (proportional to the F terms 
of other chiral fields) to be negligible compared to (7.9). Thus, although this 
mechanism saves inflation, it is not clear that it preserves our explanation 
of the size of primordial fluctuations. 

Our discussion of the end of inflation is also modified. Once the 
contribution of T is taken into account, the cosmological constant (at the end of 
inflation, but neglecting low energy gauge dynamics) is given by the value of 
(7.9) at its minimum, with the other moduli set at SUSY preserving values. 
Points with nonvanishing superpotential will now have SUSY spontaneously 
broken by the F term of the T field. If we insist that the low energy 
cosmological constant vanishes exactly (in the scenario with discrete R symmetry 
broken by low energy dynamics), then these points will also have vanishing 
cosmological constant and will be attractors of the postinflationary 
cosmological equations. This is unfortunate, because these points have gravitino 
masses of order $M^3/m_P^2$ and are ruled out by phenomenology. It would 
have been pleasant to find that they were also disfavored by cosmological 
evolution. In the next section we will see that we can still recover 
acceptable phenomenology at points of enhanced R symmetry (broken only by low 
energy dynamics). 

There is a (weakly anthropic) way of understanding why points in 
moduli space with R symmetry broken at high energies could be ruled out by 
cosmology as well as phenomenology, if we accept that there is a 
nonvanishing cosmological constant in the world we observe. Then the ratio of 
cosmological constants between the R asymmetric worlds and our own is 
$\sim (M/\mu)^6$, where $\mu$ is the scale of low energy R symmetry breaking. If 
one insists on a low energy SUSY breaking scale of order a TeV, $\mu$ is fixed 
at about $10^{13}$ GeV (see below). This gives the R asymmetric worlds a 
De Sitter horizon size of about a light year. There is certainly no galaxy 
formation in such a universe, and it does not take a degree in exobiology 
to conclude that no life is possible there. There is no plausible initial (post 
primary inflation) matter distribution which leads to any appreciable late 
time matter inside a horizon volume, unless it is collapsed into black holes. 
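
The horizon estimate follows from the fact that a de Sitter horizon radius scales as the inverse square root of the cosmological constant. The sketch below takes the ratio of cosmological constants to be $(M/\mu)^6$ with $M \approx 2 \times 10^{16}$ GeV and $\mu \approx 10^{13}$ GeV (the value implied by a TeV gravitino mass $\mu^3/m_P^2$). The present-day de Sitter horizon of about $1.6 \times 10^{10}$ light years is an assumed round number supplied for this example, not a value from the text, and the names are hypothetical.

```python
import math

M_GEV = 2e16             # unification scale quoted in the text
MU_GEV = 1e13            # scale of low energy R symmetry breaking
OUR_HORIZON_LY = 1.6e10  # ASSUMED de Sitter horizon of the observed universe

def asymmetric_horizon_ly(m_gev=M_GEV, mu_gev=MU_GEV, our_horizon_ly=OUR_HORIZON_LY):
    """Horizon of a world whose cosmological constant is (M/mu)^6 times ours.

    The horizon radius scales like 1/sqrt(cosmological constant).
    """
    ratio = (m_gev / mu_gev) ** 6
    return our_horizon_ly / math.sqrt(ratio)

horizon = asymmetric_horizon_ly()
print(f"horizon ~ {horizon:.1f} light years")
```

With these inputs the horizon comes out at the light year scale, consistent with the order-of-magnitude statement above.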

Finally, one should note that at small values of T (values of RM 
of order 1) there might be a SUSY minimum of the potential for T. This 
regime is hard to discuss because effective field theory does not apply to 
it and the notion of effective potentials, approximate moduli, and classical 
spacetime are all suspect. However, even if one assumed that such a mini- 
mum existed, one would find that one could not access it after inflation and 
it would be irrelevant to macroscopic physics. 

I have divided the discussion of inflation on moduli space into two parts, 
initially ignoring the problem of the radial modulus, because I suspect that 
it may be possible to find other scenarios in which this problem is completely 
absent. I will make a similar division of the discussion of SUSY breaking 
below. 

7.4 SUSY breaking 

Before proceeding to the discussion of SUSY breaking on the true moduli 
space, we should introduce the final characters in our story, the boundary 
or brane moduli. In Calabi-Yau compactification of weakly coupled string 
theory, there are moduli which correspond to the parameters of the $E_8 \times E_8$ 
gauge field configuration on the manifold (these are called vector bundle 
moduli in the string compactification literature). In a brane scenario these 
moduli should be thought of as living on the branes where the gauge fields 
live. In the strong coupling regime, these fields will have a superpotential of 
the form $M^3 W(b/M)$ and it is not clear that they should be called moduli at 
all. Some of them may be invariant under the discrete complex R symmetry, 
and thus belong to the true moduli space. In perturbative string theory, 
some vector bundle moduli have components $\theta_b$ which couple to gauge fields 
like axions: $\theta_b F \tilde F$. The decay constants of these axions are of order M 
because, since they live on the brane, no other scale can enter their kinetic 
terms. 

In our later considerations, we will have need of a field with a decay 
constant of order M and a very small potential energy. The vector bundle 
moduli on the standard model wall have the first of these properties. In 
perturbative string theory these fields have Peccei-Quinn symmetries which 
are broken only by world sheet instantons. It is then plausible that in the 
Hořava-Witten regime the dominant breaking of these symmetries comes 
from nonperturbative QCD. The potential energy of one of the gauge bundle 
axions would be much smaller than any fundamental scale, and would have 
the form $\Lambda_{QCD}^4\, u(a/M)$. We will consider the possibility that there are other 
moduli of this type, with a variety of scales replacing $\Lambda_{QCD}$. 

In addition to these moduli fields, any brane scenario will contain a va- 
riety of gauge fields and matter fields in nontrivial representations of the 
gauge group. The moduli will interact with these fields via the moduli 
dependence of bare gauge and Yukawa coupling parameters in the effective 
theory as well as through a variety of irrelevant operators. If the gauge 
couplings are asymptotically free and do not run to infrared fixed points at low 
energy, this description of the physics only makes sense if the bare gauge 
couplings are sufficiently small that the scale at which the effective coupling 
becomes large is substantially below the scale M. Otherwise it is not con- 
sistent to include the gauge degrees of freedom in the low energy effective 
theory. The weakness of bare couplings in these scenarios is not evident 
a priori, as it would be in a purely perturbative approach. The underly- 
ing physics is assumed to be strongly coupled. Witten [44] has shown how 
the small unified coupling of the standard model can be explained in terms 
of a product of a large number of factors of order one in a geometry of 
large dimensions. We will assume that similar numerical factors explain the 
strength of the gauge interactions that lead to SUSY breaking. 

The main role of the gauge interactions is not to break SUSY, but rather 
the discrete R symmetry. If we fix the moduli and treat the gauge theory 
as a flat space quantum field theory, then SUSY remains unbroken even 
though a nonperturbative superpotential is generated. The scale of this 
superpotential is determined via a standard renormalization group analysis 
in terms of the bare gauge coupling function $f(\phi/m_P, \chi/M)$, where we have 
indicated dependence on both bulk and boundary moduli. For simplicity 
we assume that $f$ is a large constant $f_0$ plus a smaller, moduli dependent, 
term. The conclusions are not affected by this assumption. The scale $\mu$ of 
the nonperturbative superpotential is then determined by $f_0$. It takes the 
form 

$$W_1 = \mu^3 w_1(\phi/m_P, \chi/M). \qquad (7.10)$$

We have eliminated all (composite) superfields related to the gauge 
interactions from this expression by solving their F and D flatness conditions for 
fixed values of the moduli. The possibility of doing this is equivalent to the 
statement that the gauge theory does not itself break SUSY. We assume 
that $W_1$ does not vanish at any minimum of the effective potential. This 
is the statement of spontaneous R symmetry breaking. As a consequence, 
SUSY minima of the potential have negative cosmological constant of 
order at least $\mu^6/m_P^2$ and are not attractors of the cosmological equations. 
Thus, cosmologically, R symmetry breaking forces the moduli to choose a 
minimum with spontaneously broken SUSY^®. 

Phenomenology puts an upper bound on the value of $\mu$ because it 
contributes directly to squark masses. The nonvanishing F terms are of order 
$\mu^3/m_P$. A standard argument shows that squark masses will be of order 
$\mu^3/m_P^2$, about the same as the gravitino. Assuming this is about a TeV we find 
$\mu \sim 10^{13}$ GeV. An attractive feature of this scenario is that the positive 
and negative terms in the SUGRA potential are naturally of the same 
order of magnitude. Although we have no real understanding of why the 
cosmological constant is so small, this fact of nature is an indication of a 
relation between the scales of R symmetry breaking and of SUSY 
breaking. In models in which the SUSY breaking F term originates as a bulk 
modulus the correct order of magnitude relation between these scales arises 
automatically. 

As we now recall, a deficiency of this scenario for SUSY breaking is that 
it leads to the cosmological moduli problem. The scalar fields in the bulk 
moduli multiplets acquire masses from the SUSY violating potential of order 
$m_M \sim \mu^3/m_P^2$, which is the same order of magnitude as the gravitino and 
squark masses, i.e. a TeV. They have only nonrenormalizable couplings to 
ordinary matter, scaled by $m_P$. Thus, their nominal reheat temperature, 
$m_M^{3/2}/m_P^{1/2}$, is of order $\sim 3 \times 10^{-2}$ MeV, and the universe is matter dominated 
at the time that nucleosynthesis is supposed to be taking place. The thermal 
inflation scenario [56] can solve this problem, and we will now review another 
solution [49]. 



®®The tunneling amplitudes of such nonsupersymmetric vacua into supersymmetric 
AdS vacua are incredibly tiny and might be identically zero, as discussed in [26]. 

Suppose that the coefficient in the order of magnitude relation between 
the moduli mass and the fundamental parameters is $m_M = 5 \times \mu^3/m_P^2$, 
while the squark mass is actually $m_{\tilde q} = \mu^3/4m_P^2 = 1$ TeV. Then the reheat 
temperature for the bulk moduli is multiplied by a factor of $20^{3/2} \sim 10^2$ 
and is just above 1 MeV. Thus, an innocent looking insertion of factors of 
order one can cause the moduli to decay just in time to light the furnace in 
which the primordial elements are forged. 
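
The numbers in this argument can be reproduced with the usual estimate for a gravitationally coupled scalar: a decay rate of order $m^3/m_P^2$ and hence a reheat temperature of order $m^{3/2}/m_P^{1/2}$. That scaling, with every factor of order one dropped, is the assumption behind the sketch below; the function name is hypothetical and $m_P = 2 \times 10^{18}$ GeV is the value used throughout the text.

```python
import math

M_PLANCK_GEV = 2e18  # four dimensional Planck scale used in the text

def reheat_temperature_mev(mass_gev, m_planck_gev=M_PLANCK_GEV):
    """Order-of-magnitude reheat temperature T ~ m^(3/2) / m_P^(1/2), in MeV."""
    return mass_gev ** 1.5 / math.sqrt(m_planck_gev) * 1e3

t_nominal = reheat_temperature_mev(1e3)   # modulus mass of 1 TeV
t_boosted = reheat_temperature_mev(20e3)  # modulus mass of 20 TeV (5 x 4 x 1 TeV)
print(f"T(1 TeV)  ~ {t_nominal:.3f} MeV")
print(f"T(20 TeV) ~ {t_boosted:.2f} MeV")
```

A 1 TeV modulus reheats to a few times $10^{-2}$ MeV, while raising the mass by the factor of 20 discussed above lifts the temperature by $20^{3/2}$, to just above 1 MeV.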

One still has to account for baryogenesis. Adopting a mechanism sug- 
gested long ago by Holman et al. [59] we aver that this can come from the 
decay of the moduli themselves. All of their interactions are of order the 
fundamental scale of M-theory, so there is no reason for them to preserve 
accidental symmetries like baryon and lepton number. It is quite reasonable 
that they also violate CP, though the status of CP in M-theory is somewhat 
more obscure. The decay itself is an out of equilibrium process, so all of 
the Sakharov criteria for baryogenesis are fulfilled. However, we must also 
take note of the theorem of Weinberg [60] , according to which baryon num- 
ber violating terms in the Hamiltonian must act twice in order to generate 
an asymmetry. In the decay of moduli, the first action of the Hamiltonian 
comes at no cost in amplitude, because the modulus must decay somehow 
and there is no reason for its baryon number violating decays to be sig- 
nificantly smaller than those which conserve baryon number. However the 
second baryon number violating interaction should not be highly suppressed 
if we want to generate a reasonable baryon asymmetry. Indeed, a 10 TeV, 
gravitationally coupled, particle which produces a baryon asymmetry of 
order one in its decay, also produces of order (10 TeV/3 MeV) or $\sim 3 \times 10^6$ 
photons. Thus a large suppression of the average baryon number per decay 
would give too small a baryon asymmetry. A way out of this difficulty is to 
admit renormalizable baryon number violating operators in the 
supersymmetric standard model. Discrete symmetries such as a $Z_2$ lepton parity [57] 
can adequately suppress all unobserved baryon and lepton number violating 
processes in the laboratory, while allowing such operators with coefficients 
as large as 5 x 10“®. This might be large enough to produce the observed 
baryon asymmetry. 

An unfortunate casualty of this mechanism is the lightest SUSY particle. 
The LSP is no longer stable in the scenario described above and we have 
to look elsewhere for a dark matter candidate. However, there are natural 
candidates for dark matter. Imagine a boundary modulus whose potential
energy is substantially smaller than the estimate coming from (7.10). We
will call this the dark modulus, because it will be our dark
matter candidate. It has a potential of the form $U = \Lambda^4 u(D/M)$. (In [49],
where this scenario was first proposed, the candidate was a QCD axion field. 
This model works, but the mechanism is much more general and does not 
require energy densities as small as those of the axion.)

Now let us briefly review cosmic history. First we have inflation generated by
bulk moduli fields which are not on the true moduli space (which we have
called inflamoduli). This period ends after of order 100 e-foldings, and the
universe is heated by inflamoduli decay to a temperature of order 10® GeV. 
The primordial plasma quickly redshifts away. Furthermore, as soon as the 
inflamoduli potential energy density falls to the universe becomes 

dominated by the coherent oscillations of the true bulk moduli. The dark 
modulus remains frozen at some generic point on its potential until the 
Hubble parameter falls to the mass scale of this field. At this point the
energy density of the universe is of order p ~ which is of order 

(mp/M)'^ ~ lO'^ times larger than the energy density of the dark modulus. 
The important point now is that this ratio is preserved by further cosmic 
evolution until the true bulk moduli decay. After that time, the dark energy 
density grows linearly with the inverse temperature relative to radiation, 
and matter radiation equality occurs at 10“^ MeV. This is close enough 
to the true value for the observable universe that the factors of order one 
which we have neglected throughout might account for the difference. $\Lambda$
must satisfy two constraints in order for this scenario to work: the dark
modulus must remain frozen until the true bulk moduli begin to oscillate,
and the dark modulus must have a lifetime at least as long as the age 
of the universe. The second constraint is by far the stronger, and leads to 
$\Lambda$ < 3 × 10® GeV. Axions satisfy this constraint by a large margin. Note that
this scenario completely removes the conventional cosmological constraint 
on the axion decay constant. Axions will be very weakly coupled and will 
escape all of the usual schemes for detecting them. 
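
The scaling used in this paragraph can be written out as a small sketch. It assumes only what the text states: after the true bulk moduli decay, the dark modulus density grows relative to radiation like the inverse temperature, so equality is reached at the reheat temperature multiplied by the density ratio at reheating. The numerical inputs below are hypothetical placeholders chosen for illustration and are not the values of the text.

```python
def equality_temperature(reheat_temperature, dark_to_radiation_at_reheat):
    """Temperature at which dark matter and radiation densities are equal.

    After reheating, rho_dark / rho_rad grows like 1 / T, so
    rho_dark / rho_rad = ratio_at_reheat * (T_reheat / T),
    which reaches 1 at T_eq = ratio_at_reheat * T_reheat.
    """
    return dark_to_radiation_at_reheat * reheat_temperature


# Hypothetical inputs for illustration only (not taken from the text).
t_reheat_mev = 3.0
ratio_at_reheat = 1.0e-5

t_eq_mev = equality_temperature(t_reheat_mev, ratio_at_reheat)
print(f"T_eq ~ {t_eq_mev:.1e} MeV")
```

The point of the sketch is that the equality temperature is fixed by two numbers, the reheat temperature and the conserved density ratio, both of which the scenario expresses in terms of fundamental scales.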

Another possible mechanism for baryogenesis in this scenario is that of 
Affleck and Dine [2]^®. Indeed the authors of [3] have investigated a scenario 
with a 10 TeV modulus and Affleck Dine baryogenesis and found that it can 
account for all cosmological data. In this scenario the dark matter can either 
be an LSP or, if we have strong R parity violating interactions, the dark
modulus (or a combination).

All in all, this seems to be the simplest solution of the cosmological 
moduli problem, and has the added virtue of allowing an invisible axion 
solution of the strong CP problem. I am also fond of the way in which the
version of this scenario with a dark modulus predicts the correct (within an 
order of magnitude) temperature for matter radiation equality in terms of 
fundamental parameters. 

For completeness, we should also discuss the possibility that SUSY 
breaking itself is caused by gauge interactions which are weakly coupled at 
the fundamental scale. This is required if we assume, with Dine [41,42,47] 



^^This was suggested to me by a student at the school. I thank M. Dine for detailed 
discussions of it and for pointing out the reference below.

that moduli are fixed at some enhanced symmetry point. Scenarios of
this sort are attractive because they allow us to use the idea of gauge
mediation [50] to solve the SUSY flavor problem. Gauge interactions generate
superpotentials of the form $m_1^3 w_1(C_1/m_1) + m_2^3 w_2(C_2/m_2)$, where the
$C_i$ are composite superfields and the $m_i$ the nonperturbative low energy
scales generated by asymptotic freedom. Here, in order to cancel the cosmological
constant, we must introduce an R breaking gauge theory with scale
$m_1$, which preserves SUSY, and a SUSY breaking gauge theory with scale
$m_2$, related by $m_1^3 = m_P m_2^2$. This is the price one must pay for giving up the
idea that true bulk moduli are the instigators of SUSY breaking. The ratio 
of scales between SUSY and R breaking no longer comes out naturally, but 
must be put in by hand. In compensation there is no cosmological moduli 
problem in this picture, since all moduli are assumed to be frozen by the 
initial superpotential. 

7.5 The effects of a dynamical radius 

We now have to include the dynamics of the radial modulus T. The R 
symmetry violating superpotential has an expansion^^ 

$$W = \sum_{n=0}^{\infty} \mu^3 w_n(m/m_P)\, e^{-nT/m_P}. \qquad (7.11)$$

At large radius the exponential terms are negligible. We then have no scale 
SUSY breaking even if all other bulk moduli have SUSY minima^^. One 
can then hope, as in [45], that the radius is stabilized by higher order terms 
in the Kahler potential. This would give a SUSY breaking scale close to 
[j? The resulting scenario is similar to that of the previous section. 

There is a much more substantial difference in the case where (what we 
previously called) the true moduli space is a point. T still plays the role 
of a true modulus, and we again get no-scale SUSY breaking when the low 
energy theory violates R symmetry without breaking SUSY. We can, if we 
wish, also add a low energy SUSY breaking sector, but to leading order in R 
this leads to a large positive cosmological constant. This is true no matter



It is important that, as a consequence of our assumption of an R symmetry under
which T is neutral, all terms in this expansion are proportional to the R breaking scale $\mu^3$.
This means that we cannot invoke mechanisms like that of [1] to explain the stabilization 
of the radius. 

^^However, once we take into account the SUSY violating potential coming from the F 
term of T, there is no reason to assume that the other fields sit at their SUSY minima. 
The minimum of the potential might be achieved with F terms for all the fields. 

^®It should be noted that a large positive cosmological constant is not a disaster only 
for our ability to “fit the data” . The size of the event horizon for a De Sitter space with 
energy density of scale 1 MeV is about a light second in linear size, and for the scale

what we choose for the relative scales of low energy SUSY breaking and R
symmetry breaking (as long as we try to be consistent with the lower bound 
on superpartner masses). Thus, once the radius is allowed to be dynamical 
there do not seem to be consistent scenarios with gauge mediated SUSY 
breaking. 



7.6 Generalizing Horava-Witten 

As we have noted, the moduli space of 11 dimensional SUGRA compactifications
which preserve $\mathcal{N} = 1$ SUSY in four Minkowski dimensions splits
into three components. These are Joyce sevenfolds, F theory limits of compactification
on Calabi-Yau fourfolds, and Heterotic limits of compactification
on $K3 \times CY_3$. These may be continuously connected when short
distance physics is properly taken into account. In addition, there may be 
many branches of moduli space which join onto these through generalized 
extremal transitions. The moduli space is thus highly complex. 

The cosmological arguments of these lectures indicate that the phenomenologically
relevant compactifications may belong to a highly constrained
submanifold of this complicated space. Namely, they should preserve
eight supercharges in the bulk. The breaking to $\mathcal{N} = 1$ should occur
only on branes. SUGRA compactifications preserving eight SUSYs are much
more constrained. The holonomy must be contained in SU(3), which implies
that the manifold is the product of a Calabi-Yau threefold times a torus,
modded out by a discrete group $\Gamma$. In order to obtain a smooth manifold
with eight SUSYs, $\Gamma$ should act freely and the holonomy around the new
cycles created by the $\Gamma$ identification should be in SU(3). Clearly, a way to
obtain Hořava-Witten like scenarios is to allow fixed manifolds of $\Gamma$, on
which an additional SUSY is broken. The original scenario of Hořava and
Witten was a $CY_3 \times S^1$ compactification in which $\Gamma$ is a $Z_2$ reflection on
the $S^1$. The fixed planes carry $E_8$ gauge groups, and one must also choose
an appropriate gauge bundle. A further generalization allows five branes
wrapped on two cycles of $CY_3$ to live between the planes.

It seems likely that more complicated choices of $\Gamma$ might lead to a wider
class of scenarios. The problem of classifying scenarios of this type seems
quite manageable^^. The moduli space of compactifications of M-theory
on $CY_3$ times a torus has a reasonably complicated structure, replete with
extremal transitions. Nonetheless, it is considerably simpler than the four- 
fold or Joyce manifold problem, and we know much more about its struc- 
ture. Thus, if cosmology really points us in the direction of generalized 



of SUSY breaking it is smaller by a factor of 10^^. In a theory with multiple late time 
attractors it is not hard to explain why we are not there to observe such a universe. 
^^Preliminary results on the classification problem have been obtained by Moth

Hořava-Witten compactifications, we have made real progress in the search
for the true vacuum of M-theory. 

7.7 Conclusions

Witten’s explanation of the discrepancy between the Planck and unification
scales in the context of Hořava-Witten compactifications poses a challenge
for inflationary cosmology and particularly for the notion that moduli are 
inflatons. In fact, the enhanced bulk SUSY of these compactifications gives 
us a clean definition of modular inflatons. The scenario then makes an order 
of magnitude prediction of the amplitude of primordial density fluctuations 
in terms of the unification scale. The major problem with this inflationary 
scenario comes from stabilization of the radius of the Hořava-Witten orbifold.
In leading order in the large radius approximation, radial dynamics
appears to destroy inflation. We pointed out several sources of friction for 
the radius field, which could restore inflationary solutions, but there is more 
work to be done here and a mystery remains. Assuming the radion is slowly 
rolling during inflation, why is it near its vacuum value? 

An alternative, which seems more compelling, is to recognize that the 
Dine Seiberg problem for the radial field probably requires us to contemplate 
the breakdown of the large radius expansion for its Kahler potential near the 
true VEV of this field. We argued that this meant that the Kahler potential 
was rapidly varying (as a function of T/mp) near the low energy VEV and 
that this implied that the radius would not be an inflaton but instead would 
rapidly be driven to the minimum of its potential during inflation. It is not 
clear whether the inflationary minimum is close enough to the VEV to 
salvage our explanation of the size of density fluctuations. This depends on 
properties of the Kahler potential which are, at the moment, incalculable. 

In the context of this large class of inflationary scenaria, arguments first 
discussed in [26] then focus attention on the true moduli space of M-theory, 
a locus of enhanced discrete R symmetry. Such a space almost certainly 
exists [52]. It is the attractor of postinflationary cosmological evolution. 
The further evolution of the universe then depends on whether this space 
contains bulk moduli. In the attractive scenario in which it does, the initial
Hot Big Bang generated by inflation is soon dominated by the energy
density stored in coherent oscillations of true bulk moduli. By making opti- 
mistic but plausible assumptions about coefficients of order one in order of 
magnitude estimates, one obtains a reheat temperature above that required 
by nucleosynthesis. The decay of true bulk moduli, rather than that of the 
inflaton, generates the Hot Big Bang of classical cosmology. The baryon 
asymmetry might also be generated in these decays, and this is possible if 
the SUSY standard model contains renormalizable baryon number violating 
interactions (compatible with laboratory tests of baryon and lepton number
conservation). As a consequence of this, there is no LSP dark matter
candidate. Instead, boundary moduli with a suppressed potential energy act
as a natural source of dark matter. Indeed, the ratio between the Planck 
and unification scales appears again in this scenario, this time in explaining 
the temperature at which matter and radiation make equal contributions 
to the energy density of the Universe. This estimate comes out an order of 
magnitude too high, but given the crudity of the calculation it seems quite 
plausible that this mechanism could be compatible with observation. The 
“dark modulus” which appears in this scenario could be a QCD axion with 
decay constant of order the unification scale. Our unconventional origin 
for the Hot Big Bang completely removes the cosmological upper bound 
on this decay constant. Such a particle would be undetectable in presently 
proposed axion searches. 

An alternative is to postulate the Affleck-Dine mechanism as the source 
of the baryon asymmetry in this late decaying modulus scheme. Dark matter 
could then be an LSP, a unification scale QCD axion, or some combination 
of the two. 

If a cosmology like that outlined here turns out to be correct, one might 
be tempted to revise Einstein’s famous estimate of the moral qualities of a 
hypothetical Creator. The current standard model of cosmology was con- 
structed in the sixties. Since then there has been much speculation about 
cosmology at times earlier than that at which the primordial elements were 
synthesized. Most of it has been based on an eminently reasonable extrap- 
olation of the Hot Big Bang to energy densities orders of magnitude higher. 
If the present scenario is correct, no such extrapolation is possible, and the 
conditions in the Universe in the first fraction of the First Three Minutes 
were considerably different from those at any subsequent time. There was a 
prior Big Bang after inflation, whose remnants may be forever hidden from 
us. The dark matter which dominates our universe is so weakly coupled 
to ordinary matter that its detection is far beyond the reach of currently 
planned experiments. The QCD and electroweak phase transitions never 
occurred. 

The only dramatic prediction of this scenario for currently planned ex- 
periments is the occurrence of renormalizable baryon number violation in 
the low energy SUSY world^®. The details of the baryogenesis scenario 
envisaged here should be worked out more carefully, and combined with 
laboratory constraints, to nail down precisely which kind of operators are 
allowed. The scenario is thus easily falsifiable, but even the discovery of
renormalizable baryon number violating interactions among SUSY parti- 
cles will not be a confirmation of our cosmology. Similarly, any evidence 
for the existence of more or less conventional WIMP dark matter will be a
strong indication that the present speculations are incorrect, but the failure
to discover WIMPs will not prove that they are correct.

Instead one will have to rely on the slow accumulation of evidence against 
alternatives: ruling out vanishing up quark mass and spontaneous CP viola- 
tion as solutions to the strong CP problem, the failure of conventional axion 
and WIMP searches, the discovery of renormalizable B violation. These will 
be steps on the road to proving that this cosmology is correct, but the end 
of that road is not in sight. 

We have travelled a long road, from the exotic reaches of M-theory to 
what I hope have been glimpses of more practical applications of modular 
physics to cosmology. I hope I have convinced you that the moduli of 
M-theory are likely to play a crucial role in any inflationary cosmological 
model and that many of the phenomenological and fundamental problems 
of M-theory are likely to be resolved in a cosmological context. Perhaps the 
somewhat unorthodox cosmological scenaria presented here will also prove 
to be more than just a theorist’s toys, and will play some role in the future 
of cosmology.

Abstract

A review is attempted of physical motivations, theoretical and phe- 
nomenological aspects, as well as outstanding problems, of the pre- 
Big Bang scenario in string cosmology. 



1 Introduction 

These four lectures aim at providing a summary of -and some guidance 
through- the existing literature dealing with the so-called pre-Big Bang 
(PBB) scenario, a new cosmological model largely based on the new sym- 
metries underlying superstring cosmology. The lectures will be pedagogical 
in nature and will not presuppose an advanced knowledge either of modern 
inflationary cosmology or of superstring/M-theory. Elements of both will 
be included in the lectures in order to make them reasonably self-contained. 
More exhaustive treatments of pre-Big Bang cosmology are [1] (or will soon 
be [2]) available elsewhere, while a homepage on the PBB scenario is being 
kept updated on the Web [3]. 

The four lectures roughly correspond to the four forthcoming sections 
and deal, respectively, with: 

• BASIC MOTIVATIONS AND IDEAS 

• HOW COULD IT HAVE STARTED? 

• PHENOMENOLOGICAL CONSEQUENCES 

• HOW COULD IT HAVE STOPPED? 

In particular, Lecture II (Sect. 3) contains a discussion of the initial conditions,
Lecture III (Sect. 4) discusses the phenomenological virtues and
shortcomings of the model, while Lecture IV (Sect. 5) deals with the most 
important open theoretical issues.

2 Basic motivations and ideas

2.1 Why string cosmology?

The first question that comes to one’s mind when thinking about
cosmology and string theory is: why bother? Indeed, even if
string/M-theory is the correct theory of nature, only its effective (low-energy)
quantum field theory description appears to be relevant to most of
the history of our Universe, i.e. since a very short time after the Big Bang.
This is certainly the case for the standard (hot-Big-Bang) cosmological 
model, but it is also true for the standard models of inflation, provided 
we confine our attention to what happened during the last 70 e-folds of inflation
and later (i.e. to what happened after our present horizon reached
the size of the inflationary Hubble radius). In both instances, one is only 
confronting situations in which curvatures are very small with respect to 
the fundamental scale of string theory. 

On the other hand, both the hot-Big Bang model and its inflationary 
variant suffer from initial condition problems. In the former case, these 
are just the well-known homogeneity and flatness problems that motivated 
inflation. In the latter case, although the problems look less severe, it is still 
a matter of heated discussion whether or not one should naturally expect a 
quasi-homogeneous inflaton field highly displaced from the minimum of its 
potential to emerge from the Planck era. In either case, the question of how 
to get physically appealing initial conditions lies in the realm of Planck-scale 
quantum gravity. 

At present, the only candidate for a consistent synthesis of general rel- 
ativity (GR) and quantum mechanics (QM) is superstring theory (see [4] 
for a recent review, as well as [5] for a non-specialized introduction), or, 
if we prefer, the mysterious M-theory that reduces to various superstring
theories in appropriate limits. It thus seems mandatory to ask whether the 
above questions on initial conditions do -or do not- find an answer within 
string theory. Although most string theorists would certainly agree with 
the above statements -this being after all one of the main selling points of
string theory- many of them would still object to tackling these problems
now. The “excuse” is that our understanding of string theory, especially at 
large curvatures, is still largely incomplete. Furthermore, most of the recent 
progress in non-perturbative string theory has been achieved in the context 
of “vacua” (i.e. classical solutions to the field equations) that respect a
large number of supersymmetries. By definition, a cosmological background
(a fortiori one that evolves rapidly in time) breaks (albeit spontaneously)
supersymmetry. This is why the Planckian regime of cosmology appears to 
be intractable for the time being.

There is however a pleasant surprise. About ten years of work on string
cosmology have led naturally to considering a scenario -the so-called pre- 
Big Bang (PBB) scenario- in which the Universe enjoyed a long perturbative 
“life” before the Big Bang. Starting from an almost trivial state (asymp- 
totic past triviality, see Sect. 3), the Universe would have evolved towards 
stronger and stronger curvature and coupling, thereby inflating, until it en- 
tered the non-perturbative phase that replaces the Big Bang singularity of 
more standard cosmological models. 

The situation is very much reminiscent of QCD and strong interactions. 
Perturbative QCD has been very successful in predicting a huge number of 
observables for short-distance-dominated hard processes. Successes in the 
non-perturbative, large-distance regime have been meagre, by comparison: 
we still lack a definitive proof of confinement, of spontaneous chiral symmetry
breaking, of explicit $U(1)_A$ breaking, etc. Yet, we do believe that QCD
is the correct description of hadronic physics down to scales of 10“^® cm or 
so. This is largely based on the belief that large- and short-distance physics 
“decouple”, e.g. on the assumption that the soft hadronization process does 
not affect certain infrared-safe quantities computed at the quark-gluon level. 
Fortunately, we did not wait until the confinement problem was solved, to 
take QCD seriously! 

A very similar attitude will be defended here in the case of string cosmol- 
ogy, with one amusing twist: large- and short-distance physics get somehow 
swapped as we go from QCD to gravity/cosmology. Figure 1 (from [6]) 
illustrates this point. The easy regime for gravity is at large distance/small 
curvatures; the tough one turns out to be the high-curvature regime that 
replaces here the Big Bang singularity. Yet, we shall argue that some con- 
sequences of string cosmology, those related to scales that were very large 
with respect to the string scale in the high-curvature regime, should not be 
affected, other than by a trivial kinematical red-shift, by the details of the 
pre- to post-Big Bang transition... provided, of course, that such a transi- 
tion does indeed take place (the counterpart to assuming that confinement 
does occur in QCD). 

The above reasoning does not imply, of course, that one should not 
address the hard questions now. On the contrary, the easy part of the 
game will give precious information on what the relevant hard questions are 
(for cosmology) and on how to formulate them. I have already mentioned 
an example of what I mean: insisting too much on (extended) SUSY vacua 
appears to be an unacceptable limitation for the problems at hand. Another 
example is that of demanding stability of an acceptable string vacuum: we 
shall see (in Sect. 4) that inflationary string vacua lead to tachyonic, i.e. 
to growing rather than to oscillating, modes. Such modes appear to horrify 
most string theorists; however, they are just what inflationary cosmologists 
happily use all the time in order to generate large-scale structure (LSS), and
what PBB cosmology uses to generate heat and entropy from an initially
cold Universe (see Sect. 5).

Fig. 1.

A completely different criticism of string cosmology comes from the 
cosmology end: for someone accustomed to a data-driven “bottom-up”
approach, string cosmology is too much “top-down”. There is certainly
a point here. I do not believe that a good model of cosmology is likely to 
emerge from theoretical considerations alone. Input from the data will be 
essential in the selection among various theoretical alternatives. We shall 
see explicit examples of what I mean in Section 5. Yet, it appears that 
a combination of top-down and bottom-up would be highly desirable. If 
past history can teach us something in this respect, the construction of the 
standard model of particle physics (and of QCD in particular) is a perfect 
example of a fruitful interplay of theoretically sound ideas and beautiful 
experimental results. Cosmology today resembles the particle physics of 
the sixties: interesting new data keep coming in at a high pace, while com- 
pelling theoretical pillars on which to base our understanding of those data 
are still missing. 

As a final remark, let me turn things around and claim that cosmol- 
ogy could be the only hope that we have for testing string theory in the 
foreseeable future by using the cosmos itself as the largest conceivable ac- 
celerator. The cosmological red-shift since the Big Bang has kindly brought 
down Planck-scale physics to a macroscopic scale, thus opening for us a 
window on the very early Universe. As we shall see in Section 2.3, even in 
this respect, standard and PBB inflation are markedly different. 

2.2 Why/which inflation? 

The reasons why the standard hot-Big-Bang model is unsatisfactory have 
been repeatedly discussed in the literature. For details, we refer to two 
excellent reviews [7]. Let me briefly summarize here the basic origin of 
those difficulties with the simplest Friedmann-Robertson-Walker (FRW)
cosmology. 

In the FRW framework the size of the (now observable) Universe was 
about $10^{-2}$ cm at the start of the classical era, say at $t \sim$ a few times $t_P$,
where $t_P \sim 10^{-43}$ s is the so-called Planck time. This is of course a very tiny
Universe w.r.t. its present size ($\sim 10^{28}$ cm), yet it is huge w.r.t. the horizon
(the distance travelled by light) at that time, i.e. to $l_P = c t_P \sim 10^{-33}$ cm.
In other words, a few Planck times after the Big Bang, our observable
Universe was much too large! It consisted of $(10^{30})^3 = 10^{90}$ Planckian-size,
causally disconnected regions. There had not been, since the beginning, 
enough time for the Universe to become homogeneous (e.g. to thermalize)
over its entire size. Also, soon after $t = t_P$, the Universe was characterized
by a huge hierarchy between its Hubble radius on one side and its
spatial-curvature radius on the other. The relative factor of (at least) $10^{30}$ appears
as an incredible amount of fine-tuning on the initial state of the Universe,
corresponding to a huge asymmetry between time and space derivatives.
Was this asymmetry really there? And, if so, can it be explained in any
more natural way?
It should be stressed that, while the above unexplained ratio becomes
larger and larger as we approach the Planck time (and would go to infinity 
at t = 0 if we could trust the equations throughout), it represents the ratio 
of two classical length scales. It so happens that one of the two lengths 
becomes the (quantum) Planck scale at t = tp, but the ratio is still huge at 
much later times when both scales have nothing to do with (and are much 
larger than) tp. This comment will be very relevant to the discussion of 
fine-tuning issues given in Section 3.6. 
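
The counting of causally disconnected regions in the discussion above is just the cube of a ratio of two classical lengths. A minimal sketch, assuming a rounded linear ratio of about 10^30 between the size of the observable patch and the horizon a few Planck times after the Big Bang (the function name and the constant are illustrative, not the author's notation):

```python
def causally_disconnected_regions(linear_ratio):
    """Number of horizon-sized cells in a volume whose side is linear_ratio horizons."""
    return linear_ratio ** 3


# Rounded ratio (patch size) / (horizon) a few Planck times after the Big Bang;
# treated here as an assumed round number.
LINEAR_RATIO = 10 ** 30

regions = causally_disconnected_regions(LINEAR_RATIO)
print(f"regions = 10^{len(str(regions)) - 1}")
```

Because the count scales as the cube of the linear ratio, even a modest change in the assumed ratio shifts the number of regions by several orders of magnitude, but it remains enormous in any case.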

It is well known that a generic way to wash out inhomogeneities and 
spatial curvature consists in introducing, in the history of the Universe, a 
long period of accelerated expansion, called inflation [7]. This still leaves two
alternatives: either the Universe was generic at the Big Bang and became
flat and smooth because of a long post-bangian inflationary phase; or it was
already flat and smooth at the Big Bang as a result of a long pre-bangian
inflationary phase. 

Assuming, dogmatically, that the Universe (and time itself) started at 
the Big Bang, leaves only the first alternative. However, that solution has 
its own problems, in particular those of fine-tuned initial conditions and 
inflaton potentials. Besides, it is quite difficult [8] to base standard inflation
on the only known candidate theory of quantum gravity, superstring theory.
Rather, as we shall argue in a moment, superstring theory gives strong hints 
in favour of the second (pre-Big Bang) possibility through two of its very 
basic properties, the first in relation to its short-distance behaviour, the 
second from its modifications of GR even at large distance. 

2.3 Superstring-inspired cosmology 

As just mentioned, two classes of properties of string theory are relevant for 
cosmology. Let us discuss them in turn. 

A) Short-distance properties 

Since the classical (Nambu-Goto) action of a string is proportional to the
area $A$ of the surface it sweeps, its quantization must introduce a quantum
of length $\lambda_s$ through:

$$S/\hbar = A/\lambda_s^2. \qquad (2.1)$$

This fundamental length, replacing Planck’s constant in quantum string 
theory [9], plays the role of a minimal observable length, of an ultraviolet 
cut-off. Thus, in string theory, physical quantities are expected to be bound 
by appropriate powers of $\lambda_s$, e.g.

$$H^2 \sim G\rho < \lambda_s^{-2}, \qquad k_B T/\hbar < c\,\lambda_s^{-1}, \qquad R_{\rm comp} > \lambda_s. \qquad (2.2)$$

In other words, in quantum string theory, relativistic quantum mechanics
should solve the singularity problems in much the same way as 
non-relativistic quantum mechanics solved the singularity problem of the 
hydrogen atom by keeping the electron and the proton a finite distance 
apart. By the same token, string theory gives us a rationale for asking dar- 
ing questions such as: what was there before the Big Bang? Certainly, in 
no other present theory can such a question be meaningfully asked. 

B) Large-distance properties 

Even at large distance (low-energy, small curvatures), superstring theory 
does not automatically give Einstein’s GR. Rather, it leads to a
scalar-tensor theory of the JBD variety. The new scalar particle/field $\phi$, the so-called
dilaton, is unavoidable in string theory, and gets reinterpreted as
the radius of a new dimension of space in so-called M-theory [10]. By 
supersymmetry, the dilaton is massless to all orders in perturbation theory, 
i.e. as long as supersymmetry remains unbroken. This raises the question: 
Is the dilaton a problem or an opportunity? My answer is that it could be 
both; and while we can try to avoid its potential dangers, we may try to 
use some of its properties to our advantage... Let me discuss how. 

In string theory, $\phi$ controls the strength of all forces [11], gravitational
and gauge alike. One finds, typically:

$$l_P^2/\lambda_s^2 \sim \alpha_{\rm gauge} \sim e^{\phi}, \qquad (2.3)$$

showing the basic unification of all forces in string theory and the fact that,
in our conventions, the weak-coupling region coincides with $\phi \ll -1$. In
order not to contradict precision tests of the equivalence principle, and of 
the constancy of the gauge and gravitational couplings in the “recent” past, 
we require [12] the dilaton to have a mass (see, however [13] for an amusing 
alternative) and to be frozen at the bottom of its own potential today. This 
does not exclude, however, the possibility of the dilaton having evolved 
cosmologically (after all, the metric did!) within the weak coupling region 
where it was practically massless. The amazing (yet simple) observation [14] 
is that, by so doing, the dilaton may have inflated the Universe! 

A simplified argument, which, although not completely accurate, cap- 
tures the essential physical point, consists in writing the Friedmann equation 
(for a spatially flat Universe): 



$$3H^2 = 8\pi G\rho , \qquad (2.4)$$

and in noticing that a growing dilaton (meaning through (2.3) a growing 
G) can drive the growth of H even if the energy density of standard matter 
decreases in an expanding Universe. This new kind of inflation (characterized
by growing $H$ and $\phi$) has been termed dilaton-driven inflation (DDI).
The basic idea of pre-Big Bang cosmology [14-17] is thus illustrated in
Figure 2: the dilaton started at very large negative values (where it was 
practically massless), ran over a potential hill, and finally reached, sometime
in our recent past, its final destination at the bottom of its potential
($\phi = \phi_0$). Incidentally, as shown in Figure 2, the dilaton of string theory
can easily roll up, rather than down, potential hills, as a consequence of
its non-standard coupling to gravity.



Fig. 2. [Plot of the dilaton potential $V(\phi)$.]



DDI is not just possible. It exists as a class of (lowest-order) cosmological 
solutions thanks to the duality symmetries of string cosmology [14, 18, 19]. 
Under a prototype example of these symmetries, the so-called scale-factor 
duality (SFD) [14,18], a FRW cosmology evolving (at lowest order in deriva- 
tives) from a singularity in the past is mapped into a DDI cosmology going 
towards a singularity in the future. Of course, the lowest order approxima- 
tion breaks down before either singularity is reached. A (stringy) moment 
away from their respective singularities, these two branches can easily be 
joined smoothly to give a single non-singular cosmology, at least mathe- 
matically. Leaving aside this issue for the moment (see Sect. 5 for more 
discussion), let us go back to DDI. Since such a phase is characterized 
by growing coupling and curvature, it must itself have originated from a 
regime in which both quantities were very small. We take this as the main 
lesson/hint to be learned from low-energy string theory by raising it to the 
level of a new cosmological principle, that of “Asymptotic Past Triviality”,
to be discussed in the next Lecture.

2.4 Explicit solutions

Many explicit exact PBB-type solutions to the low-energy effective action 
equations have been constructed and discussed in the literature. For an 
excellent review, see [1]. Exact solutions can only be obtained in the pres- 
ence of symmetries (isometries) and, although they are heuristically very 
important, they are too special from the point of view of an inflationary 
cosmology, which, as such, should not accept fine-tuned initial conditions. 
This is why we shall not go into an exhaustive discussion of explicit solutions
here. Instead, in Section 3, we will address the general problem of the
evolution of asymptotically trivial initial data.

Here we shall limit our attention to the simplest Bianchi I-type solutions 
and to their quasi-homogeneous generalizations, after recalling that many 
more solutions can be obtained from the former by using the non-compact
$O(d,d)$ symmetry of the low-energy string-cosmology equations [19] when
the Kalb-Ramond (KR) field $B_{\mu\nu}$ is turned on, or by S-duality transformations
(see e.g. [1]) generating a homogeneous axion field (related to $B_{\mu\nu}$ by
yet another duality transformation).

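The generic homogeneous Bianchi I solution with $B_{\mu\nu} = 0$ reads, for
$t < 0$,

$$ds^2 = -dt^2 + \sum_i (-t)^{2\alpha_i}\, dx^i dx^i , \qquad \phi = -\Big(1 - \sum_i \alpha_i\Big) \log(-t) , \qquad 1 = \sum_i \alpha_i^2 , \qquad (2.5)$$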



i.e. represents a generalization of the well-known Kasner solutions (see
e.g. [20]) in which one of the two Kasner constraints (the one linear in the
$\alpha_i$) is replaced by the equation giving the time dependence of $\phi$ ($\phi$ is absent,
or constant, for Kasner, hence the second constraint).

Note that, unlike Kasner’s, (2.5) allows for isotropic solutions ($\alpha_i =
\pm 1/\sqrt{d}$ for all $i$). Also, the quadratic Kasner constraint automatically has
$2^d$ SFD-related branches, obtained by changing the sign of any subset of
the $\alpha$’s. Also note that the so-called shifted dilaton $\bar\phi$, defined by:

$$\bar\phi = \phi - \frac{1}{2} \log(\det g_{ij}) , \qquad (2.6)$$

which is invariant under the full $O(d,d)$ group, is always given by:

$$\bar\phi = -\log(-t) . \qquad (2.7)$$

A quasi-homogeneous generalization of (2.5) was first discussed in [21] (see
also [22]) and reads:

$$ds^2 = -dt^2 + \sum_a e^a_i(x)\, e^a_j(x)\, (-t)^{2\alpha_a(x)}\, dx^i dx^j , \qquad \phi = -\Big(1 - \sum_a \alpha_a(x)\Big) \log(-t) , \qquad 1 = \sum_a \alpha_a^2(x) , \quad t < 0 , \qquad (2.8)$$



where x stands for the space coordinates. Equation (2.8) can be shown to 
be a generic asymptotic solution of the full PDEs near the t = 0 singu- 
larity where spatial gradients become less and less important w.r.t. time 
derivatives, justifying the validity of the so-called gradient expansion [23]. 
Note that equation (2.7) is not modified in the quasi-homogeneous solutions. 
Besides allowing isotropic cosmologies in the homogeneous case, the pres- 
ence of the dilaton also removes the necessity of a chaotic (BKL-type [24]) 
behaviour near the singularity [25]. 
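
As a quick numerical check of the homogeneous solution (2.5)-(2.7), the sketch below evaluates the dilaton and the shifted dilaton for a given set of Kasner-like exponents. It is only an illustration: the function name and the sample exponents are hypothetical choices, and the code assumes the diagonal metric $g_{ij} = (-t)^{2\alpha_i}\delta_{ij}$, so that $\frac{1}{2}\log\det g_{ij} = \sum_i \alpha_i \log(-t)$.

```python
import math


def dilaton_and_shifted_dilaton(alphas, t):
    """Return (phi, phi_bar) for the Bianchi I solution at a time t < 0.

    alphas: exponents alpha_i, assumed to satisfy sum(alpha_i**2) == 1.
    """
    if t >= 0:
        raise ValueError("the pre-Big Bang branch is defined for t < 0")
    if not math.isclose(sum(a * a for a in alphas), 1.0, rel_tol=1e-12):
        raise ValueError("exponents must satisfy sum(alpha_i**2) == 1")
    log_minus_t = math.log(-t)
    # Dilaton: phi = -(1 - sum_i alpha_i) log(-t)
    phi = -(1.0 - sum(alphas)) * log_minus_t
    # For the diagonal metric, (1/2) log det g_ij = sum_i alpha_i log(-t)
    half_log_det = sum(alphas) * log_minus_t
    # Shifted dilaton: phi_bar = phi - (1/2) log det g_ij
    phi_bar = phi - half_log_det
    return phi, phi_bar


d = 3
isotropic = [-1.0 / math.sqrt(d)] * d   # isotropic exponents, allowed by the dilaton
anisotropic = [0.6, -0.8, 0.0]          # an arbitrary anisotropic example
for alphas in (isotropic, anisotropic):
    for t in (-10.0, -0.5):
        phi, phi_bar = dilaton_and_shifted_dilaton(alphas, t)
        print(alphas, t, phi, phi_bar, -math.log(-t))
```

In every case the last two printed numbers agree: the shifted dilaton does not depend on the choice of exponents.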



2.5 Phase diagrams and Penrose-style overview

It is useful to visualize the PBB scenario with the help of some diagrams. 
Since the actual phase space of the model is multidimensional, each of these 
diagrams necessarily represents just a cross section of the complete picture. 

A very commonly used diagram (Fig. 3) is the flow-diagram in the $\dot{\bar\phi}, H$
plane (time being just a parameter along the flow lines). Since, at lowest
order, $\ddot{\bar\phi} > 0$, the flow is always from left to right near the origin. The four
straight lines represent the four (isotropic for simplicity) solutions connected 
by SFD and time-reversal. The product of the two transformations repre- 
sents the physically interesting case, since it maps ordinary decelerating 
FRW cosmology (top left) to dilaton-driven inflation (top right). Clearly, 
our scenario needs a high-curvature phase during which the left-to-right 
flow is inverted (as shown by the dotted line joining the two perturbative 
branches). This can only happen as the result of higher-order corrections 
(see Sect. 5). 
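
The assignment of the branches to the quadrants of the flow diagram can be checked with a few lines of code. The sketch assumes the isotropic lowest-order solutions $a(t) = |t|^{\alpha}$ with $\alpha = \pm 1/\sqrt{d}$, and a shifted dilaton equal to $-\log|t|$ on both sides of $t = 0$; the function names and quadrant labels are illustrative, not taken from the original text.

```python
import math


def isotropic_branch(alpha, t):
    """Return (H, dH/dt, d(phi_bar)/dt) for a(t) = |t|**alpha, phi_bar = -log|t|."""
    if t == 0.0:
        raise ValueError("t = 0 is the singularity of the lowest-order solution")
    hubble = alpha / t                   # H = d(log a)/dt
    hubble_dot = -alpha / t**2           # dH/dt
    shifted_dilaton_dot = -1.0 / t       # d(phi_bar)/dt
    return hubble, hubble_dot, shifted_dilaton_dot


def quadrant(alpha, t):
    """Quadrant of the flow diagram (horizontal axis d(phi_bar)/dt, vertical axis H)."""
    hubble, _, shifted_dilaton_dot = isotropic_branch(alpha, t)
    vertical = "top" if hubble > 0 else "bottom"
    horizontal = "right" if shifted_dilaton_dot > 0 else "left"
    return f"{vertical} {horizontal}"


d = 3
alpha = 1.0 / math.sqrt(d)
branches = {
    "FRW-like, t > 0": (alpha, 1.0),
    "SFD of FRW, t > 0": (-alpha, 1.0),
    "time reversal of FRW, t < 0": (alpha, -1.0),
    "SFD times time reversal (DDI), t < 0": (-alpha, -1.0),
}
for name, (a, t) in branches.items():
    hubble, hubble_dot, _ = isotropic_branch(a, t)
    print(f"{name}: {quadrant(a, t)}, H = {hubble:+.3f}, dH/dt = {hubble_dot:+.3f}")
```

The product of the two transformations sends the decelerating branch (top left, decreasing $H$) to the branch with growing $H$ in the top right quadrant, which is the dilaton-driven inflation discussed in the text.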

A second useful diagram (Fig. 4) is the $e^{\phi}, H$ plot, i.e. the curvature
(energy) coupling plane. The fully perturbative domain (where evolution
starts according to the APT postulate) lies, in a log-log plot, in the far
bottom-left corner. Sticking again, for simplicity, to the isotropic case, DDI
evolution is represented by parallel lines distinguished by different initial
values of the dilaton (i.e. of the coupling). It is clear that all these solutions
run, eventually, into strong curvature or strong coupling (shown as thick
solid lines), which one is hit first being determined by the above-mentioned
initial coupling. A discussion of what might happen afterwards is given in
Section 5.

Fig. 3. [Flow diagram; one region is labelled “contracting Universes”.]

Fig. 4.
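
Which of the two boundaries is hit first can be estimated in the isotropic case with $d = 3$. This is a minimal sketch, assuming the lowest-order solution (2.5) with $\alpha_i = -1/\sqrt{3}$, for which $H \propto 1/(-t)$ and $e^{\phi} \propto (-t)^{-(1+\sqrt{3})} \propto H^{1+\sqrt{3}}$, so that the trajectories are parallel straight lines in a log-log plot. The thresholds $H\lambda_s = 1$ and $e^{\phi} = 1$ are order-of-magnitude assumptions for the strong-curvature and strong-coupling boundaries, and the function name is hypothetical.

```python
import math

# Slope d log(e^phi) / d log(H) along isotropic DDI in d = 3 (assumption stated above)
SLOPE = 1.0 + math.sqrt(3.0)


def first_boundary(hubble_initial, coupling_initial):
    """Decide which boundary a DDI trajectory reaches first.

    hubble_initial:   initial H in units of 1/lambda_s (must be < 1)
    coupling_initial: initial value of e^phi (must be < 1)
    Returns (label, value of e^phi when H * lambda_s reaches 1).
    """
    if not (0.0 < hubble_initial < 1.0 and 0.0 < coupling_initial < 1.0):
        raise ValueError("initial data must lie inside the perturbative region")
    # e^phi grows like H**SLOPE, so at H = 1 it equals e^phi_i * H_i**(-SLOPE)
    coupling_at_string_curvature = coupling_initial * hubble_initial ** (-SLOPE)
    if coupling_at_string_curvature < 1.0:
        return "strong curvature", coupling_at_string_curvature
    return "strong coupling", coupling_at_string_curvature


for coupling in (1e-20, 1e-5):
    print(coupling, first_boundary(1e-3, coupling))
```

For a fixed initial Hubble parameter, a sufficiently small initial coupling makes the trajectory reach string-scale curvature while still weakly coupled; a larger one reaches strong coupling first.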

As a third possibility, let us use a Carter-Penrose style plot [26] (Fig. 5) 
to represent, on a finite piece of paper, the entire evolution of the Universe.
Unlike in ordinary cosmology, where the CP diagram is truncated by the
(space-like) hypersurface of the Big-Bang singularity, here the whole CP 
diagram, going from past to future time-like and null infinities, is physically 
meaningful because of our assumption that finite-string-size effects remove 
the Big Bang singularity. This diagram will be discussed and used in the 
following sections. 



Fig. 5.

Finally, let us represent the basic difference between the standard inflation
scenario and that of PBB cosmology by plotting, for each cosmological
model, the Hubble horizon ($H^{-1}$) and the physical scale that coincides with
it today, as functions of cosmic time. This gives rise to two “wine glasses”
(Fig. 6), which are very similar in their upper parts (corresponding to recent
epochs) but differ markedly at very early times. The most salient difference
appears in the early behaviour of the Hubble horizon, an increasing function
of time in standard inflation, a decreasing one in the PBB case.

Fig. 6. (a) Standard inflation’s wineglass; (b) Pre-Big Bang’s wineglass. [Axis label in both panels: Time.]

The figure allows me to stress one phenomenological advantage of PBB inflation:
Planck- (or string-) scale physics, being no longer washed out by a
long, subsequent inflationary phase, becomes accessible to present (or near-future)
experiments at the millimetre (100 GHz) scale. At the same time,
larger-scale experiments (such as those on small-angle CMB anisotropies)
will test (sub-Planckian-energy) physics during the pre-bangian phase. By
contrast, as we have already mentioned in the Introduction, in standard
inflation large-scale data probe the Universe as it was seventy e-folds or so
before the end of inflation, while shorter scales tell us about more recent
epochs. Since we know that, seventy e-folds before the end of inflation, $H_{\rm inf}$
was less than $10^{-5} M_P$ (or else excessive large-scale anisotropies are created,
see Sect. 4), and that such a scale slowly decreases during (slow-roll) inflation,
it is clear that, according to standard inflation, physics at energies
larger than $10^{-5} M_P$ remains inaccessible.

3 How could it have started? 

3.1 Generic asymptotically trivial past 

We have already mentioned that, in standard non-inflationary cosmology,
initial conditions have to be fine-tuned to incredible accuracy in the far past
(i.e. at $t \sim t_P \sim 10^{-43}$ s). What does this fine-tuning problem look like if
we accept hints from scale-factor duality and assume asymptotically trivial,
yet generic, initial conditions?

The concept of asymptotic past triviality (APT) is quite similar to that 
of “asymptotic flatness”, familiar from general relativity [27]. The main 
differences consist in making only assumptions concerning the asymptotic 
past (rather than future or space-like infinity) and in the additional pres- 
ence of the dilaton. It seems physically (and philosophically) satisfactory 
to identify the beginning with simplicity (see e.g. the entropy-related arguments
given in Sect. 5.7). What could be simpler than a trivial, empty and
flat Universe? Nothing, of course! The problem is that such a Universe, besides
being uninteresting, is also non-generic. By contrast, asymptotically
flat/trivial Universes are initially simple, yet generic, in a precise mathematical
sense that we shall now discuss.

From the point of view of space-time (taken here, for simplicity, to
be (3+1)-dimensional) the generic solution depends upon four arbitrary
functions of three coordinates [28] related to the metric, plus two more
each for the dilaton and the KR field $B_{\mu\nu}$. Amusingly, there is an exact
correspondence between this “target-space” counting and a “world-sheet”
counting. In the latter, those eight arbitrary functions correspond
to eight arbitrary functions of three-momentum entering the most general
physical (i.e. on shell) vertex operator describing gravitons, dilatons, and
the KR field (which, in four dimensions, is equivalent to a pseudoscalar,
the KR axion). We will see in Section 3.4 how these arbitrary functions
appear in the asymptotic expansion of our fields.

Can a very rich and complicated Universe, like our own, emerge from 
such extremely simple initial conditions? This would look much like a mir- 
acle. However, as I shall argue below, this is precisely what should be ex- 
pected, owing to well-known classical and quantum gravitational 
instabilities. 



3.2 The asymptotic past's effective action and different (conformal) frames 

The APT postulate implies that the early-time evolution of the Universe 
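can be described in terms of the low-energy tree-level action of string theory.
Taking a generic closed superstring theory, this reads:

$$\Gamma_s = \lambda_s^{1-d} \int d^{d+1}x \sqrt{-g}\, e^{-\phi} \Big( R + g^{\mu\nu} \partial_\mu\phi\, \partial_\nu\phi - \frac{1}{12} (dB)^2 - 2\Lambda \Big) , \qquad (3.1)$$

where $dB$ is the (three-form) field strength associated with $B_{\mu\nu}$.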

A further simplification comes from assuming that we are dealing with so-called
critical superstring theory, the case in which the tree-level (and actually the
all-order perturbative) cosmological constant $\Lambda$ vanishes. This requires a
total of $D = 10$ space-time dimensions. If $D \neq 10$ there will be an effective
cosmological constant $O(\lambda_s^{-2})$ preventing the existence of any low-curvature
solution of the field equations. A similar conclusion is reached if we consider
critical, but non-supersymmetric, string theories (see Sect. 3.3).

Equation (3.1) receives corrections when curvatures become $O(\lambda_s^{-2})$ or
when the coupling becomes $O(1)$. If such corrections are both negligible,
it sometimes becomes useful to perform a change of variable by going to the 
so-called Einstein frame (not to be confused with different frames in GR). 
This is done by defining: 



$$g^{(E)}_{\mu\nu} = g_{\mu\nu}\, e^{-2(\phi - \phi_0)/(d-1)} . \qquad (3.2)$$



It is relatively easy to rewrite the action (3.1) using the Einstein metric. 
The result is simply: 



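$$\Gamma_E = \ell_P^{1-d} \int d^{d+1}x \sqrt{-g^{(E)}} \Big[ R^{(E)} - \frac{1}{d-1}\, g_{(E)}^{\mu\nu} \partial_\mu\phi\, \partial_\nu\phi - \frac{1}{12}\, e^{-4(\phi-\phi_0)/(d-1)} (dB)^2_{(E)} \Big] , \qquad (3.3)$$

where $\Lambda = 0$, all indices are contracted with the Einstein metric, and $\ell_P$,
given by $\ell_P^{d-1} = e^{\phi_0} \lambda_s^{d-1}$, is the present value of the Planck length.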

Although the use of the Einstein frame could simplify some calculations, 
and we shall see examples of this below, it should be kept in mind that 
the form of the corrections is no longer so simple. For instance, higher- 
derivative corrections become important when the Einstein-frame curvature 
is 0{lp^e~'^'^ = A“^), i.e. reaches a dilaton-dependent critical value. 
Similarly, having a constant Newton “constant” in this frame is a mere
illusion because (even tree-level) string masses do now depend upon $\phi$. For
these reasons, although physical results are frame-independent, we shall
always describe them with reference to the original string-frame metric in
which the string length $\lambda_s$ is constant.

Let us finally remark that the two frames have been made to coincide 
today, with the dilaton fixed at its present value $\phi_0$. Similarly, the assumption
of APT would also allow the identification of the two frames in the far
past, since the dilaton approaches a constant as $t \to -\infty$. However, the two
Einstein frames that coincide with the string frame at $t = \pm\infty$ differ from
each other by an enormous conformal factor, i.e. by a huge blowing-up of 
all physical scales. 

3.3 Classical asymptotic symmetries: The importance of SUSY 

The classical equations that follow from varying (3.1) or (3.3), besides be- 
ing generally covariant, are also invariant under a two-parameter group of 
(global) transformations acting as follows: 

$$\phi \to \phi + c , \qquad g_{\mu\nu} \to \lambda^2 g_{\mu\nu} . \qquad (3.4)$$

Indeed (3.1, 3.3) are simply rescaled by a constant factor under this group. 
These two symmetries depend crucially on the validity of the tree-level 
low-energy approximation and on the absence of a cosmological constant. 
Loop corrections clearly spoil invariance under dilaton shifts, while lower 
derivatives (a cosmological constant) or higher derivatives ($\alpha'$) corrections
spoil invariance under a rescaling of the metric. Note that, using general 
covariance, the latter symmetry is equivalent to an overall rescaling of all 
the coordinates. The relevance of the two classical symmetries on the issue 
of fine-tuning will become obvious in the next two subsections. 

The importance of dealing with critical superstring theory now becomes
evident: if one considered non-supersymmetric string theories, a cosmological
constant would almost certainly be generated at some finite order of
the loop expansion; this would completely change the large-distance properties
and spoil the symmetries of the field equations.

3.4 Dilaton-driven inflation as gravitational collapse 

For simplicity, we will only illustrate here the simplest case of a gravi-dilaton
system already compactified to four space-time dimensions. Through the
field redefinition (3.2), our problem is reduced to the study of a massless
scalar field minimally coupled to gravity. It is well known that such a form of
matter cannot give inflation (since it has positive pressure). Instead, it can
easily lead to gravitational collapse (GC). Thus, in the Einstein frame, the
problem becomes that of finding out under which conditions gravitational
collapse occurs if asymptotically trivial initial data are assigned. Gravitational
collapse usually means that the (Einstein) metric (hence the volume
of 3-space) shrinks to zero at a space-like singularity. However, typically,
the dilaton blows up at that same singularity. Given the relation (3.2) between
the Einstein and the (physical) string metric, we can easily imagine
that the latter blows up near the singularity, as implied by DDI.

How generically does GC happen? Let us recall the singularity theorems 
of Hawking and Penrose [29], which state that, under some general assump- 
tions, singularities are inescapable in GR. Looking at the validity of those 
assumptions in the case at hand, one finds that all but one are automati- 
cally satisfied. The only condition to be imposed is the existence of a closed 
trapped surface (CTS) (a closed surface from which future light cones lie
entirely in the region inside the surface). Rigorous results [30] show that 
this condition cannot be waived: sufficiently weak initial data do not lead to 
closed trapped surfaces, to collapse, or to singularities. Sufficiently strong 
initial data do. But where is the border-line? This is not known in general, 
but precise criteria do exist for particularly symmetric space-times, e.g. for 
those endowed with spherical symmetry (see Sect. 3.6). 

However, no matter what the general collapse/singularity criterion will
eventually turn out to be, we do know, from the classical symmetries
described in the previous subsection, that such a criterion cannot depend

• on an over-all additive constant in $\phi$, or

• on an over-all multiplicative factor in $g_{\mu\nu}$.

A characterization of APT initial data can be made [31] following the
pioneering work [27] of Bondi, Sachs, Penrose, and others. Since our initial
quanta are assumed to consist of massless gravitons and dilatons, their past
infinity is null: it is the famous $\mathcal{I}^-$ of the Penrose diagram (Fig. 5). APT
means that dilaton and metric can be expanded near $\mathcal{I}^-$ in inverse powers
of $r \to \infty$, while advanced time $v$ and two angular variables, $\theta$ and $\varphi$, are
kept fixed. We shall thus write:

$$\phi = \phi_0 + \frac{f(v,\theta,\varphi)}{r} + o\Big(\frac{1}{r}\Big) , \qquad (3.5)$$

$$g_{\mu\nu} = \eta_{\mu\nu} + \frac{f_{\mu\nu}(v,\theta,\varphi)}{r} + o\Big(\frac{1}{r}\Big) . \qquad (3.6)$$



The null wave data on $\mathcal{I}^-$ are: the asymptotic dilatonic wave form $f(v,\theta,\varphi)$,
and two polarization components, $f_+(v,\theta,\varphi)$ and $f_\times(v,\theta,\varphi)$, of the
asymptotic gravitational wave form $f_{\mu\nu}(v,\theta,\varphi)$, whose other components can be
gauged away. The three functions $f$, $f_+$, $f_\times$ of $v, \theta, \varphi$ are equivalent to six
functions of $r, \theta, \varphi$ with $r > 0$, because the advanced time $v$ ranges over the
full line $(-\infty, +\infty)$. This is how the six arbitrary functions of the generic
solution to the gravi-dilaton system are recovered.

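Of particular interest here are the so-called News functions, simply given
by

$$N(v,\theta,\varphi) = \partial_v f(v,\theta,\varphi) , \qquad N_+ = \partial_v f_+ , \qquad N_\times = \partial_v f_\times , \qquad (3.7)$$

and the “Bondi mass” given by:

$$g_{vv} = -\Big(1 - \frac{2M_-(v)}{r}\Big) + o\Big(\frac{1}{r}\Big) . \qquad (3.8)$$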

The Bondi mass and the News are connected by the energy-momentum
conservation equation, which tells us that the advanced-time derivative of
$M_-(v)$ is positive-semidefinite and related to incoming energy fluxes controlled
by the News:

$$\frac{dM_-(v)}{dv} = \frac{1}{4} \int \frac{d^2\omega}{4\pi} \big( N^2 + N_+^2 + N_\times^2 \big) . \qquad (3.9)$$

The physical meaning of $M_-(v)$ is that it represents the energy brought into
the system (by massless sources) by advanced time $v$. In the same spirit one
can define the Bondi mass $M_+(u)$ at future null infinity $\mathcal{I}^+$. It represents
the energy still present in the system at retarded time $u$. If only massless
sources are present, the so-called ADM mass is given by

$$M_-(+\infty) = M_+(-\infty) = M_{\rm ADM} , \qquad (3.10)$$

while $M_-(-\infty) = 0$, and $M_+(+\infty) \equiv M_C$ represents the mass that has not
been radiated away even after waiting an infinite time, i.e. the mass that
underwent gravitational collapse [32]. Collapse (resp. no-collapse) criteria
thus aim at establishing under which initial conditions one expects to find
$M_C > 0$ (resp. $M_C = 0$).

Since, as we shall see in the particular case of spherical symmetry, collapse
criteria i) do not involve any particularly large number, and ii) do not
contain any intrinsic scale but just dimensionless ratios of various classical
scales, we expect i) gravitational collapse to be quite a generic phenomenon
and ii) that nothing, at the level of our approximations, will be able to
fix either the size of the horizon or the value of $\phi$ at the onset of collapse.
Generically, and quite randomly and chaotically, some regions of space will
undergo gravitational collapse, forming horizons and singularities therein.
When this is translated into the string frame, the region of space-time within
the horizon undergoes a period of DDI in which both the initial value of the
Hubble parameter and that of $\phi$ are left arbitrary. In the next subsection
we shall see that such arbitrariness provides an answer to the fine-tuning
objections that have recently been raised [33] against the PBB scenario. This
section will be concluded with a discussion of how the case of spherical
symmetry can be dealt with more precisely.



3.5 Fine-tuning issues 

The two arbitrary parameters discussed in the previous subsection are very 
important, since they determine the range of validity of our description. In 
fact, since both curvature and coupling increase during DDI, the low-energy 
and/or tree-level description is bound to break down at some point. The 
smaller the initial Hubble parameter (i.e. the larger the initial horizon size)
and the smaller the initial coupling, the longer we can follow DDI through 
the effective action equations and the larger the number of reliable e-folds 
we shall gain. 

This does answer, in my opinion, the objections raised recently [33] to 
the PBB scenario according to which it is fine-tuned. The situation here 
actually resembles that of chaotic inflation [34]. Given some generic (though
APT) initial data, we should ask what the distribution of sizes of the
collapsing regions and of couplings therein is. Then, only the “tails” of these
distributions, i.e. those corresponding to sufficiently large, and sufficiently
weakly coupled, regions will produce Universes like ours; the rest will not.
The question of how likely a “good” Big Bang is to take place is not very 
well posed and can be greatly affected by anthropic considerations [31]. 

In conclusion, we may summarize recent progress on the problem of 
initial conditions by saying that [31]: 

Dilaton-Driven Inflation in String Cosmology 
is as generic as 

Gravitational Collapse in General Relativity. 

Furthermore, asking for a sufficiently long period of DDI amounts to setting 
upper limits on two arbitrary moduli of the classical solutions. 

Figure 7 (from Ref. [31]) gives a (2+1)-dimensional sketch of a possible
PBB Universe: an original “sea” of dilatonic and gravity waves leads to
collapsing regions of different initial size, possibly to a scale-invariant
distribution of them. Each one of these collapses is reinterpreted, in the string
frame, as the process by which a baby Universe is born after a period of
PBB inflationary “pregnancy”, the size of each baby Universe being determined
by the duration of the corresponding pregnancy, i.e. by the initial
size of (and coupling in) the corresponding collapsing region. Regions
initially larger than $10^{-13}$ cm can generate Universes like ours, smaller ones
cannot.

Fig. 7.



A basic difference between the large numbers needed in (non-inflationary)
FRW cosmology and the large numbers needed in PBB cosmology
should be stressed. In the former, the ratio of two classical scales, e.g.
of total curvature to its spatial component, which is expected to be $O(1)$,
has to be taken as large as $10^{60}$. In the latter, the above ratio is initially
$O(1)$ in the collapsing/inflating region, and ends up being very large in that
same region, thanks to DDI. However, the (common) order of magnitude of
these two classical quantities is a free parameter, and it is taken to be much
larger than the classically irrelevant quantum scale. Indeed, the smallness
of quantum corrections (which would introduce a scale in the problem) was
explicitly checked in [35].

We can visualize analogies and differences between standard and pre- 
Big Bang inflation by looking again at Figures 6a and 6b. The common 
feature in the two pictures is that the fixed comoving scale corresponding 
to the present horizon was “inside the horizon” for some time during infla- 
tion, possibly very deeply inside at its onset. The difference between the two 
scenarios is just in the behaviour of the Hubble radius during inflation: in- 
creasing in standard inflation (a), decreasing in string cosmology (b). Thus, 
while standard inflation is still facing the initial-singularity question and 
needs a non-adiabatic phenomenon to reheat the Universe (a kind of small 
bang), PBB cosmology faces the singularity problem later, combining it 
with the exit and heating problems (see Sect. 5). 



3.6 The spherically symmetric case 



In the spherically symmetric case many authors have studied the problem of 
gravitational collapse of a minimally coupled scalar field both numerically 
and analytically. In the former case I will only mention the well-known 
results of Choptuik [36], pointing at mysterious universalities near critical
collapse (i.e. at the border-line situation in which the collapse criteria are
just barely met). In this case, a very small black hole forms. This is not 
the case we are really interested in for the reasons we just explained. We 
shall thus turn, instead, to what happens when the collapse criteria are 
largely fulfilled. For this we make use of the rather powerful results due to 
Christodoulou over a decade of beautiful work [30,32,37,38]. 

There are no gravitational waves in the spherically symmetric case so
that null wave data consist of just an angle-independent asymptotic dilatonic
wave form $f(v)$, with the associated scalar News $N(v) = f'(v)$.

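A convenient system of coordinates is the double null system, $(u,v)$,
such that

$$\phi = \phi(u,v) , \qquad (3.11)$$

$$ds^2 = -\Omega^2(u,v)\, du\, dv + r^2(u,v)\, d\omega^2 , \qquad (3.12)$$

where $d\omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$. The field equations are conveniently
re-expressed in terms of the three functions $\phi(u,v)$, $r(u,v)$ and $m(u,v)$, where
the local mass function $m(u,v)$ is defined by:

$$1 - \frac{2m}{r} = g^{\mu\nu} (\partial_\mu r)(\partial_\nu r) = -\frac{4}{\Omega^2} \frac{\partial r}{\partial u} \frac{\partial r}{\partial v} . \qquad (3.13)$$

One gets the following set of evolution equations for $m$, $r$ and $\phi$:

$$2\, \frac{\partial r}{\partial u} \frac{\partial m}{\partial u} = \Big(1 - \frac{2m}{r}\Big) \frac{r^2}{4} \Big(\frac{\partial\phi}{\partial u}\Big)^2 , \qquad (3.14)$$

$$2\, \frac{\partial r}{\partial v} \frac{\partial m}{\partial v} = \Big(1 - \frac{2m}{r}\Big) \frac{r^2}{4} \Big(\frac{\partial\phi}{\partial v}\Big)^2 , \qquad (3.15)$$

$$r\, \frac{\partial^2 r}{\partial u\, \partial v} = \frac{2m}{r - 2m} \frac{\partial r}{\partial u} \frac{\partial r}{\partial v} , \qquad (3.16)$$

$$r\, \frac{\partial^2 \phi}{\partial u\, \partial v} + \frac{\partial r}{\partial u} \frac{\partial\phi}{\partial v} + \frac{\partial r}{\partial v} \frac{\partial\phi}{\partial u} = 0 . \qquad (3.17)$$

The quantity

$$\mu(u,v) = \frac{2m(u,v)}{r} \qquad (3.18)$$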

plays a crucial role in the problem. If $\mu$ stays everywhere below 1, the
field configuration will not collapse but will finally disperse at infinity as
outgoing waves. By contrast, if the mass ratio $\mu$ can reach anywhere the
value 1, this signals the formation of an apparent horizon $\mathcal{A}$. The location
of this apparent horizon is indeed defined by the equation

$$\mathcal{A} : \quad \mu(u,v) = 1 . \qquad (3.19)$$



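The above statements are substantiated by some rigorous inequalities [38]
stating that:

$$\frac{\partial r}{\partial u} < 0 , \qquad \frac{\partial m}{\partial v} \ge 0 , \qquad (3.20)$$

$$(1-\mu)\, \frac{\partial r}{\partial v} \ge 0 , \qquad (1-\mu)\, \frac{\partial m}{\partial u} \le 0 . \qquad (3.21)$$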



Thus, in weak-field regions ($\mu < 1$), $\partial_v r > 0$, while, as $\mu > 1$, $\partial_v r < 0$, meaning
that the outgoing radial null rays (“photons”) emitted by the sphere
$r =$ const become convergent, instead of having their usual behaviour. This
is nothing else but the signature of a CTS!

In the case of spherical symmetry, it has been possible to prove [37] that 
the presence of trapped surfaces implies the existence of a future singular 
boundary B of space-time where a curvature singularity occurs. Further- 
more, the behaviour of various fields near the singularity is just that of a 
quasi-homogeneous DDI as described by equations (2.8)! This highly non- 
trivial result strongly supports the idea that PBB inflation in the string 
frame is the counterpart of gravitational collapse in the Einstein frame. 

Reference [37] gives the following sufficient criterion on the strength of
characteristic data, considered at some finite retarded time $u$:

$$\frac{2\Delta m}{\Delta r} \ge \Big[ \frac{r_1}{r_2} \log\Big(\frac{r_1}{2\Delta r}\Big) + \frac{6 r_1}{r_2} - 1 \Big] , \qquad (3.22)$$

where $r_1 \le r_2$, with $r_2 \le 3r_1/2$, are two spheres, $\Delta r = r_2 - r_1$ is the width
of the “annular” region between the two spheres, and $\Delta m = m_2 - m_1 =
m(u,r_2) - m(u,r_1)$ is the mass “contained” between the two spheres, i.e.
more precisely the energy flux through the outgoing null cone $u =$ const,
between $r_1$ and $r_2$. Note the absence of any intrinsic scale (in particular
of any short-distance cut-off) in the above criterion. The theorem proved
in [37] is not exhausted in the above statement. It contains various bounds
as well, e.g.

• an upper bound on the retarded time at which the CTS (i.e. a horizon) 
is formed; 

• a lower bound on the mass, i.e. on the radius of the collapsing region. 

The latter quantity is very important for the discussion of the previous 
subsection since it gives, in the equivalent string-frame problem, an upper 
limit on the Hubble parameter at the beginning of DDI. Such an upper limit
depends only on the size of the advanced-time interval satisfying the collapse
criterion (CC);
since the latter is determined by the scale-invariant condition (3.22), the 
initial scale of inflation will be classically undetermined. 
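
The scale invariance of the criterion can be made explicit numerically. The sketch below is only illustrative: it assumes the bound of [37] in the form $2\Delta m/\Delta r \ge (r_1/r_2)\log(r_1/2\Delta r) + 6r_1/r_2 - 1$ with $r_1 < r_2 \le 3r_1/2$, works in units where masses and lengths are measured in the same unit ($G = 1$), and uses a hypothetical function name and sample values.

```python
import math


def collapse_criterion(r1, r2, m1, m2):
    """Evaluate the assumed sufficient collapse criterion between two spheres.

    Returns (satisfied, lhs, rhs) with lhs = 2*dm/dr and rhs the logarithmic bound.
    """
    if not (0.0 < r1 < r2 <= 1.5 * r1):
        raise ValueError("the criterion requires r1 < r2 <= 3*r1/2")
    delta_r = r2 - r1
    delta_m = m2 - m1
    lhs = 2.0 * delta_m / delta_r
    rhs = (r1 / r2) * math.log(r1 / (2.0 * delta_r)) + 6.0 * r1 / r2 - 1.0
    return lhs >= rhs, lhs, rhs


# Same dimensionless data at two very different overall scales
for scale in (1.0, 1e10):
    print(scale, collapse_criterion(1.0 * scale, 1.2 * scale, 0.0, 0.5 * scale))

# Weaker data in the same annular region fail the test
print(collapse_criterion(1.0, 1.2, 0.0, 0.1))
```

Rescaling all radii and masses by a common factor leaves both sides unchanged, so the test fixes only dimensionless ratios and not the size of the collapsing region.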

The above criterion is rigorous but probably too conservative. It also
has the shortcoming that it cannot be used directly on $\mathcal{I}^-$, since $u \to -\infty$
on $\mathcal{I}^-$. In reference [31] a less rigorous (or less general) but simpler criterion
directly expressible in terms of the News (i.e. on $\mathcal{I}^-$) was proposed on the
basis of a perturbative study. It has the following attractive form:

$$\sup_{v_1, v_2 \,;\; v_1 \le v_2} {\rm Var}\big(N(x)\big)_{x \in [v_1, v_2]} > C = O(1/4) , \qquad (3.23)$$

where:

$${\rm Var}\big(N(x)\big)_{x \in [v_1, v_2]} \equiv \Big\langle \big( N(x) - \langle N \rangle_{[v_1, v_2]} \big)^2 \Big\rangle_{x \in [v_1, v_2]} . \qquad (3.24)$$

Thus ${\rm Var}(g)_{[v_1, v_2]}$ denotes the “variance” of the function $g(x)$ over the
interval $[v_1, v_2]$, i.e. the average squared deviation from its mean value.
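
A criterion of this type is easy to evaluate on sampled data. The following sketch is an illustration under stated assumptions, not the procedure of [31]: the News is sampled on a uniform grid in advanced time, averages over an interval are approximated by averages over the grid points it contains, the Gaussian pulse is an arbitrary test profile, and $C = 1/4$ is taken at face value although the text only fixes its order of magnitude.

```python
import math


def sup_variance(samples):
    """Largest variance of the samples over all windows of at least two points.

    Returns (variance, first_index, last_index) of the maximizing window.
    """
    prefix, prefix_sq = [0.0], [0.0]
    for x in samples:
        prefix.append(prefix[-1] + x)          # running sum of N
        prefix_sq.append(prefix_sq[-1] + x * x)  # running sum of N^2
    best = (0.0, 0, 0)
    n = len(samples)
    for i in range(n):
        for j in range(i + 1, n):
            count = j - i + 1
            mean = (prefix[j + 1] - prefix[i]) / count
            mean_sq = (prefix_sq[j + 1] - prefix_sq[i]) / count
            variance = mean_sq - mean * mean   # <N^2> - <N>^2 over the window
            if variance > best[0]:
                best = (variance, i, j)
    return best


def gaussian_news(amplitude, grid):
    """Hypothetical test profile N(v) = amplitude * exp(-v^2)."""
    return [amplitude * math.exp(-v * v) for v in grid]


grid = [-5.0 + 0.05 * k for k in range(201)]
C = 0.25  # assumed value of the O(1/4) constant
for amplitude in (0.2, 2.0):
    variance, i, j = sup_variance(gaussian_news(amplitude, grid))
    print(amplitude, variance, (grid[i], grid[j]), variance > C)
```

The weak pulse stays below the threshold on every window, while the strong one exceeds it on windows that straddle the edge of the pulse.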

According to this criterion the largest interval satisfying (3.23) determines
the size of the collapsing region and thus, through the collapse-inflation
connection, the initial value of the Hubble parameter. It would be
interesting to confirm the validity of the above criterion and to determine 
more precisely the value of the constant appearing on its r.h.s. through more 
analytic or numerical work. Actually, numerical studies of spherically sym- 
metric PBB cosmologies have already appeared [39], while more powerful 
numerical codes should soon be available [40]. 

4 Phenomenological consequences 

4.1 Cosmological amplification of vacuum fluctuations: General properties 

I will start by recalling the basic physical mechanism underlying particle 
production in cosmology (for a nice review, see [41]) and by introducing
the corresponding (and by now standard) jargon. By the very definition of
inflation ($\ddot a > 0$) physical wavelengths are stretched past the Hubble scale
($H^{-1}$) during inflation. After the end of inflation each wavelength grows
slower than $H^{-1}$ and thus “re-enters” the horizon. Obviously, the larger
the scale the earlier it crosses the horizon outward and the later it crosses it 
back inward. Hence larger scales “spend” more time “outside the horizon” 
than smaller ones. 
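
The ordering of horizon crossings can be made concrete with a toy background. The sketch below assumes an exactly constant Hubble parameter during inflation and an instantaneous transition to radiation domination (where $aH \propto 1/a$); both are simplifying assumptions chosen for illustration and are not specific to the PBB scenario. The function name and units are hypothetical.

```python
import math

def horizon_crossings(k, hubble=1.0, a_end=1.0):
    """Scale factors at horizon exit and re-entry for comoving wavenumber k.

    Toy model (assumption): constant H until a = a_end, then radiation
    domination, during which the comoving Hubble rate a*H falls like 1/a.
    """
    k_end = a_end * hubble          # last mode to leave the horizon
    if not 0 < k <= k_end:
        raise ValueError("mode never leaves the horizon in this toy model")
    a_exit = k / hubble             # k = a H during inflation
    a_reentry = a_end * k_end / k   # k = a H = k_end * a_end / a afterwards
    efolds_outside = math.log(a_reentry / a_exit)
    return a_exit, a_reentry, efolds_outside

large_scale = horizon_crossings(0.01)
small_scale = horizon_crossings(0.1)
```

In this toy model the number of e-folds spent outside the horizon is $2\ln(k_{\rm end}/k)$, so it grows logarithmically with the size of the scale.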

The attentive reader may worry at this point about the way this description
applies when distances are measured using the Einstein-frame metric.
As we have seen in the previous section, PBB inflation corresponds to accelerated
contraction in the Einstein frame. Nonetheless, one can show that
physical quantities (that is, typically, dimensionless ratios of physical quantities)
do not depend on the choice of the frame: after all, changing frame
is nothing more than a local field-redefinition, which is known not to affect
the physics. It is amusing to notice, for instance, that physical wavelengths
go outside the horizon during the Einstein-frame equivalent of DDI. Indeed,
although physical scales shrink during the collapse, the horizon $H^{-1}$
shrinks even faster! I refer to the first paper in [16] for further discussion
on this point.

Consider now a generic perturbation $\Psi$ on top of a homogeneous background,
which includes a cosmological-type metric, a dilaton, and, possibly,
other fields, such as another inflaton field, an axion, etc. Since $\Psi = 0$ is, by
definition of a perturbation, a classical solution, $\Psi$ itself enters the effective
low-energy action quadratically. Soon after the beginning of inflation
the background itself becomes homogeneous, isotropic, and spatially flat, so
that the perturbed action takes the generic form:

$$I = \frac12 \int d\eta\, d^3x\; S(\eta) \left[ \Psi'^2 - (\nabla\Psi)^2 \right] . \tag{4.1}$$

Here $\eta$ is conformal time ($a\, d\eta = dt$), and a prime denotes $\partial/\partial\eta$. The function
$S(\eta)$ (sometimes called the “pump” field) is, for any given $\Psi$, a given
function of the scale factor $a(\eta)$, and of other scalar fields (four-dimensional
dilaton $\phi(\eta)$, moduli $b_i(\eta)$, etc.), which may appear non-trivially in the
background.

While it is clear that a constant $S$ may be reabsorbed by rescaling $\Psi$, and
is thus ineffective, a time-dependent $S$ couples non-trivially to $\Psi$ and leads
to the production of pairs of quanta (with equal and opposite momenta).
In order to see this, it is useful to go over to a Hamiltonian description of
the perturbation and of its canonically conjugate momentum $\Pi$:

$$\Pi = \frac{\delta I}{\delta \Psi'} = S \Psi' . \tag{4.2}$$

The Hamiltonian corresponding to (4.1) is thus given by

$$H = \frac12 \int d^3x \left[ S^{-1} \Pi^2 + S (\nabla\Psi)^2 \right] \tag{4.3}$$

and the first-order Hamilton equations read

$$\Psi' = \frac{\delta H}{\delta \Pi} = S^{-1} \Pi , \qquad \Pi' = -\frac{\delta H}{\delta \Psi} = S \nabla^2 \Psi , \tag{4.4}$$

leading to the decoupled second order equations

$$\Psi'' + \frac{S'}{S} \Psi' - \nabla^2 \Psi = 0 , \qquad \Pi'' - \frac{S'}{S} \Pi' - \nabla^2 \Pi = 0 . \tag{4.5}$$

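In Fourier space the Hamiltonian (4.3) is given by

$$H = \frac12 \sum_{\vec k} \left( S^{-1} \Pi_{\vec k} \Pi_{-\vec k} + S k^2 \Psi_{\vec k} \Psi_{-\vec k} \right) , \tag{4.6}$$

where $\Psi_{-\vec k} = \Psi_{\vec k}^*$ and $\Pi_{-\vec k} = \Pi_{\vec k}^*$. The equations of motion become

$$\Psi_{\vec k}' = S^{-1} \Pi_{-\vec k} , \qquad \Pi_{\vec k}' = -S k^2 \Psi_{-\vec k} , \tag{4.7}$$

where $k = |\vec k|$. The transformation

$$\Pi_{\vec k} \to \tilde\Pi_{\vec k} = k \Psi_{\vec k} , \qquad \Psi_{\vec k} \to \tilde\Psi_{\vec k} = -k^{-1} \Pi_{\vec k} , \qquad S \to \tilde S = S^{-1} \tag{4.8}$$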

leaves the Hamiltonian, Poisson brackets, and equations of motion un- 
changed. This symmetry of linear perturbation theory, and its physical 
consequences, was discussed in [42] under the name of S-duality, since it 
contains the usual strong-weak coupling (electric-magnetic) duality in the 
special case of gauge perturbations. 
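
As a quick consistency check (a sketch using only the Fourier-space Hamiltonian (4.6), the equations of motion (4.7) and the transformation (4.8), not a reproduction of the argument in [42]), one can substitute the transformed variables directly:

```latex
\begin{align*}
\tilde H &= \tfrac12 \sum_{\vec k} \left( \tilde S^{-1} \tilde\Pi_{\vec k} \tilde\Pi_{-\vec k}
            + \tilde S k^2 \tilde\Psi_{\vec k} \tilde\Psi_{-\vec k} \right)
          = \tfrac12 \sum_{\vec k} \left( S k^2 \Psi_{\vec k} \Psi_{-\vec k}
            + S^{-1} \Pi_{\vec k} \Pi_{-\vec k} \right) = H , \\
\tilde\Psi_{\vec k}' &= -k^{-1} \Pi_{\vec k}' = S k\, \Psi_{-\vec k}
          = \tilde S^{-1} \tilde\Pi_{-\vec k} , \\
\tilde\Pi_{\vec k}' &= k\, \Psi_{\vec k}' = k S^{-1} \Pi_{-\vec k}
          = -\tilde S k^2 \tilde\Psi_{-\vec k} .
\end{align*}
```

The two terms of the Hamiltonian are simply exchanged, and the transformed variables obey equations of the same form with $S$ replaced by $S^{-1}$.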

In order to solve the perturbation equations, and to normalize the spectrum,
it is convenient to introduce the normalized (but no longer canonically
conjugate) variables $\hat\Psi$, $\hat\Pi$, whose Fourier modes are defined by

$$\hat\Psi_k = S^{1/2} \Psi_k , \qquad \hat\Pi_k = S^{-1/2} \Pi_k , \tag{4.9}$$

so that the Hamiltonian density takes the canonical form:

$$H = \frac12 \sum_k \left( |\hat\Pi_k|^2 + k^2 |\hat\Psi_k|^2 \right) . \tag{4.10}$$

Under S-duality, these new variables transform as the original ones. They
satisfy the Schrödinger-like equations

$$\hat\Psi_k'' + \left[ k^2 - (S^{1/2})'' S^{-1/2} \right] \hat\Psi_k = 0 , \qquad
\hat\Pi_k'' + \left[ k^2 - (S^{-1/2})'' S^{1/2} \right] \hat\Pi_k = 0 . \tag{4.11}$$

The amplification of perturbations is typically associated with a transition
from an inflationary phase in which the pump field is accelerated to a
post-inflationary phase in which the pump field is decelerated or constant. In such
a class of backgrounds, the “effective potentials”, $V_\Psi = (S^{1/2})'' S^{-1/2}$ and
$V_\Pi = (S^{-1/2})'' S^{1/2}$, grow during the phase of accelerated evolution, and
decrease in the post-inflationary, decelerated epoch, vanishing asymptotically
both for very early times, $\eta \to -\infty$, and for very late times, $\eta \to +\infty$.

The initial evolution of perturbations, for all modes with $k^2 > |V_\Psi|$,
$|V_\Pi|$, may be described by the WKB-like approximate solutions of
equations (4.11)

$$\hat\Psi_k(\eta) = (k^2 - V_\Psi)^{-1/4} \exp\left[ -i \int_{\eta_0}^{\eta} d\eta'\, (k^2 - V_\Psi)^{1/2} \right] ,$$
$$\hat\Pi_k(\eta) = k\, (k^2 - V_\Pi)^{-1/4} \exp\left[ -i \int_{\eta_0}^{\eta} d\eta'\, (k^2 - V_\Pi)^{1/2} \right] , \tag{4.12}$$

which we have normalized to a vacuum fluctuation, and where the extra
factor of $k$ in the solution for $\hat\Pi_k$ comes from consistency with the first
order equations (4.7). We have ignored a possible relative phase in the
solutions. Solutions (4.12) manifestly preserve the S-duality symmetry of
the equations, since the potentials $V_\Psi$, $V_\Pi$ get interchanged under $S \to S^{-1}$.

Let us now discuss two opposite regimes: 

• when the perturbation is deeply inside the horizon ($k/a \gg H$) we find
“adiabatic” behaviour, i.e.

$$k \Psi_k \sim k^{1/2} S^{-1/2} e^{-ik\eta} , \qquad \Pi_k \sim k^{1/2} S^{1/2} e^{-ik\eta} , \tag{4.13}$$

implying, through (4.3), that the contribution to the Hamiltonian of
modes inside the horizon stays constant;

• when the perturbation is far outside the horizon ($k/a \ll H$), it enters
the so-called freeze-out regime in which $\Psi$ and $\Pi$ stay constant
(better, have a constant solution, see [42]). Such a behaviour implies,
again through (4.3), that the contribution of super-horizon modes to
the Hamiltonian grows in time. If $S' > 0$, the growth of the Hamiltonian
is due to $\Psi$, while, for $S' < 0$, it is due to $\Pi$. In either case the growth
is due to particle production in squeezed states [43], i.e. states in
which one canonical variable is very sharply defined and the conjugate
one is largely undetermined. Although, strictly speaking, quantum
coherence is not lost, in practice the sub-fluctuating variable cannot
be measured with unlimited precision (coarse graining) and therefore
entropy is produced (see Sect. 4.6).

It is not too hard to join the two extreme regimes mentioned above and to
find the qualitative and quantitative features of the solutions. For lack of
space we refer the reader to the original literature (see, e.g. [42]).
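
The two regimes can also be seen in a direct numerical integration of the first-order mode equations. The sketch below is only an illustration under stated assumptions: it uses the de Sitter-like pump field $S = 1/\eta^2$ (not a PBB background), treats the modes $\vec k$ and $-\vec k$ alike by evolving a single mode function with $\Pi = S\Psi'$, and adopts the conventional $1/\sqrt{2k}$ vacuum normalization. All function names are hypothetical.

```python
import cmath
import math

def pump(eta):
    """Assumed pump field S(eta) = 1/eta**2 for eta < 0 (toy de Sitter-like case)."""
    return 1.0 / (eta * eta)

def rhs(eta, psi, pi, k):
    """First-order mode equations: Psi' = Pi / S, Pi' = -S k^2 Psi."""
    s = pump(eta)
    return pi / s, -s * k * k * psi

def mode_energy(eta, psi, pi, k):
    """Contribution of one mode to the Hamiltonian, (S^-1 |Pi|^2 + S k^2 |Psi|^2) / 2."""
    s = pump(eta)
    return 0.5 * (abs(pi) ** 2 / s + s * k * k * abs(psi) ** 2)

def initial_state(k, eta):
    """Adiabatic (inside-horizon) vacuum: Psi = S^-1/2 e^{-ik eta} / sqrt(2k)."""
    s = pump(eta)
    phase = cmath.exp(-1j * k * eta)
    return phase / math.sqrt(2.0 * k * s), -1j * math.sqrt(0.5 * k * s) * phase

def evolve_mode(k=1.0, eta_start=-50.0, eta_end=-0.01, steps=100_000):
    """Integrate one Fourier mode with fixed-step fourth-order Runge-Kutta."""
    psi, pi = initial_state(k, eta_start)
    h = (eta_end - eta_start) / steps
    eta = eta_start
    for _ in range(steps):
        k1 = rhs(eta, psi, pi, k)
        k2 = rhs(eta + 0.5 * h, psi + 0.5 * h * k1[0], pi + 0.5 * h * k1[1], k)
        k3 = rhs(eta + 0.5 * h, psi + 0.5 * h * k2[0], pi + 0.5 * h * k2[1], k)
        k4 = rhs(eta + h, psi + h * k3[0], pi + h * k3[1], k)
        psi += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        pi += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        eta += h
    return psi, pi

psi_start, pi_start = initial_state(1.0, -50.0)
psi_end, pi_end = evolve_mode()
energy_start = mode_energy(-50.0, psi_start, pi_start, 1.0)
energy_end = mode_energy(-0.01, psi_end, pi_end, 1.0)
```

For this particular pump field the exact mode function is known, and $|\Psi_k|$ freezes at $1/(\sqrt{2}\, k^{3/2})$ once $k|\eta| \ll 1$, while the mode's contribution to the Hamiltonian grows like $1/\eta^2$; since $S' > 0$ here, the growth comes from the $\Psi$ term, in line with the second bullet above.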

The above considerations were very general. What is instead typical of 
the PBB scenario? There are at least two features that are quite unique to 
string cosmology: 

• pump fields, and in particular their contributions to the evolution 
equations (4.11), grow during PBB inflation, while they tend to decay 
in standard inflation; 

• the richer set of backgrounds and fluctuations present in string theory
allows for the amplification of new kinds of perturbations.

One can easily determine the pump fields for each one of the most interesting
perturbations appearing in the PBB scenario. The result is:

Gravity waves, dilaton : $S = a^2 e^{-\phi}$

Heterotic gauge bosons : $S = e^{-\phi}$

Kalb-Ramond, axions : $S = a^{-2} e^{-\phi}$, $a^2 e^{\phi}$ . (4.14)

In the following subsections we shall briefly describe the characteristics of
these four perturbations after their original vacuum fluctuations are amplified
by PBB inflation. For further details, see also [44].

4.2 Tensor perturbations: An observable cosmic gravitational radiation background (CGRB)?

It is not surprising to find that, for tensor and dilaton perturbations, the
pump field is nothing but the scale factor in the Einstein frame ($a_E = a e^{-\phi/2}$),
since, in this frame, the actions for gravity and for the dilaton take
the canonical form. The Einstein-frame scale factor corresponds to a collapsing
Universe (see Sect. 3), hence to the decreasing pump field $a_E(\eta) \sim (-\eta)^{1/2}$
during DDI. For scales that go outside the horizon during DDI, this implies [45]
a Rayleigh-Jeans-like spectrum, $d\Omega/d\log k \sim k^3$, up to logarithmic
corrections [45].

When the curvature scale reaches the string scale we expect DDI to
end, and a high (string scale) curvature phase to follow, before the eventual
exit to the FRW phase takes place (see Sect. 5). Not much is known
about the string phase, but, using some physical arguments as well as some
quantitative estimates, it can be argued that such a phase will lead to copious
GW production at frequencies corresponding to the string scale at
the time of exit. After the transition to the FRW phase, all particle production
switches off. This is why our GW spectrum has an end point that
corresponds to the string/Planck scale at the beginning of the FRW phase.
If no inflation takes place afterwards, the end-point frequency corresponds, today,
to $\omega = \omega_1 \sim 100$ GHz.

As illustrated in Figure 8, the GW spectrum can be rather flat below the
end point, down to the frequency $\omega_s$, the last scale that went out of the horizon
during DDI. Further below $\omega_s$ we get the above-mentioned steep spectrum.
It thus looks as if the best chances for the detection of our stochastic
background lie precisely near $\omega_s$, where a kink (or knee) is expected.




Fig. 8. 

Unfortunately, the position of the knee and the value of $\Omega_{GW}$ at that
point depend on two background parameters that are, so far, difficult to
predict. One corresponds to the duration of (better, the total red-shift
during) the string phase, the other to the value of $\ell_P/\lambda_s$ at the end of DDI
(hence to the value of the dilaton at that time). As shown in Figure 8, values
of $\Omega_{GW}$ in the range of $10^{-6}$-$10^{-7}$ are possible in some regions of parameter
space, which, according to some estimates of sensitivities [46] reported in
the same figure for $\omega_s \sim 10^2$ Hz, could be inside detection capabilities in
the near future.

The signal is predicted to consist of randomly distributed standing waves,
a feature that has been argued [47] to further help detection. In any case,
cross-correlation experiments are mandatory here in order to disentangle
this stochastic signal from real noise. Sensitivities to a CGRB of this
type have been estimated for a variety of two-detector combinations [46].
A comprehensive review of GW experiments and of their relevance to the
early Universe can be found in [48].



4.3 Dilaton perturbations 

Since the dilaton is, after all, the inflaton of PBB cosmology, its fluctu- 
ations are the most natural source of adiabatic scalar perturbations. We 
recall that, in standard cosmology, inflaton fluctuations naturally lead to 
a quasi scale-invariant, Harrison-Zeldovich (HZ) spectrum of adiabatic per- 
turbations, something highly desirable both to explain CMB anisotropy and 
for models of LSS formation. Can we get something similar from the dila- 
ton? The answer, unfortunately, is no! Let me spend a moment explaining 
why. 

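Unlike tensor perturbations, which do not couple to the scalar field to
linear order and are gauge-invariant by themselves, scalar perturbations are
contained, a priori, in five functions defined by:

$$ds^2 = a^2(\eta) \left[ -(1 + 2\Phi)\, d\eta^2 + \big( (1 - 2\Psi) \delta_{ij} + \partial_i \partial_j E \big)\, dx^i dx^j - 2 \partial_i B\, dx^i d\eta \right]$$
$$\phi = \phi_0(\eta) + \chi(\eta, \vec x) . \tag{4.15}$$

The five functions $\Phi$, $\Psi$, $B$, $E$, $\chi$ are not separately gauge-invariant. However,
the following “Bardeen” combinations are gauge-invariant:

$$\Phi_B = \Phi + \frac{1}{a} \left[ a (B - E') \right]' , \qquad
\Psi_B = \Psi - \frac{a'}{a} (B - E') , \qquad
\chi_{GI} = \chi + \phi_0' (B - E') . \tag{4.16}$$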

Introducing the variable $v = a \chi_{GI}$ the scalar field enters the quadratic
action “canonically” i.e.:

$$S_{\rm eff}(v) = \frac12 \int d\eta\, d^3x \left[ v'^2 - (\nabla v)^2 + (z''/z)\, v^2 \right] \tag{4.17}$$

giving the evolution equation

$$v_k'' + \left( k^2 - z''/z \right) v_k = 0 . \tag{4.18}$$

In the DDI background, $z \sim a$ and thus the canonical scalar field obeys
the same equation as the canonical graviton field, therefore giving identical
spectra (as long as the dilaton remains massless, of course). This strongly
suggests that adiabatic perturbations in PBB cosmology have a Rayleigh-Jeans,
rather than HZ, spectrum and that they are unsuitable for generating
CMBA or LSS. Before being sure of that, however, we have to analyse
the scalar fluctuations of the metric itself in terms of the above-mentioned
Bardeen potentials $\Phi_B$, $\Psi_B$.

A popular gauge (particularly advertised in [41]) is the so-called longitudinal
gauge, defined by $B = E = 0$, where $\Phi_B = \Phi$ and $\Psi_B = \Psi$. In this
gauge one of the constraints simply reads $\Phi = \Psi$, while a second constraint
relates either one of them to $v$:



where we have inserted the small-$k$ behaviour of $v_k$, which is identical to
that of tensor perturbations.

Unfortunately, equation (4.19) leads to very large fluctuations of $\Psi = \Phi$
at small $k\eta$, so large that one leaves the linear-perturbation regime for
the expansion (4.15) of the metric much before the high-curvature scale is
reached. Does this mean that the metric becomes very inhomogeneous? It
would look to be the case... unless the growth of $\Psi$ and $\Phi$ is in some way a
gauge artefact. But how can it be a gauge artefact if $\Psi$ and $\Phi$ correspond,
in this gauge, to the gauge-invariant Bardeen potentials? The answer to
this question was provided in [49]. By going from the longitudinal gauge to
an “off-diagonal” gauge with $\Psi = E = 0$, or, even better, to one in which
only $\Phi$ and $E$ appear, one finds that perturbations of the metric remain
small at all $\eta$ till Planckian/string-scale curvatures are reached.

This is easy to see, for instance, in a gauge with $\Psi = B = 0$, where
$\Psi_B \sim (a'/a) E'$. Clearly this gives $E \sim \eta^2 \Psi_B$ and, since $E$ enters the metric
with two spatial derivatives, this implies that $h_{ij} \sim (k\eta)^2 \Psi_B$, which is
sufficiently small at small $k\eta$ for linear perturbation theory to be valid. One can
then look for physical effects of these scalar perturbations (e.g. for contributions
to CMBA) and find that they actually remain as small as the tensor
contributions. In conclusion, once gauge artefacts are removed, it seems
that adiabatic scalar perturbations, as well as their tensor counterparts,
remain exceedingly small at large scales.

On the other hand, the rather large yields at short scales also apply to 
dilatons. This allows for a possible source of scalar waves if the dilaton 
is very light. However, as recently discussed by Gasperini [50], it is very 
unlikely that such a signal will be observable, given the constraints on the 
dilaton mass due to tests of the equivalence principle (see Sect. 2). Other 
restrictions on the dilaton mass come from the possibility that their density 
may become overcritical and close the Universe. This and other possible 
interesting windows in parameter space are discussed in [17], and will not 
be reported in any detail here.

4.4 Gauge-field perturbations: Seeds for $B_{\rm gal}$?

In standard inflationary cosmology there is no amplification of the vacuum
fluctuations of gauge fields. This is a straightforward consequence of the fact
that inflation makes the metric conformally flat, and of the decoupling of
gauge fields from a conformally flat metric precisely in D = 3+1 dimensions.

As a very general remark, apart from pathological solutions, the only
background field that can amplify, through its cosmological variation, e.m.
(more generally gauge-field) quantum fluctuations is the effective gauge coupling
itself [51]. By its very nature, in the pre-Big Bang scenario the effective
gauge coupling inflates together with space during the PBB phase. It is thus
automatic that any efficient PBB inflation brings with it a huge variation
of the effective gauge coupling, and thus a very large amplification of the
primordial e.m. fluctuations [52]. This can possibly provide the long-sought
origin for the primordial seeds of the observed galactic magnetic fields.

To be more quantitative, since the pump field for electromagnetic perturbations
is the effective (four-dimensional) gauge coupling itself (see
Eq. (4.14)), the total amplification of e.m. perturbations on any given scale
$\lambda$ is given by $\alpha_0/\alpha_{\rm ex}$, i.e. by the ratio of the fine structure constant now and
the fine structure constant at the time of exit of the scale $\lambda$ during DDI.
It turns out [52] that, in order to produce sufficiently large seeds for the
galactic magnetic fields, such a ratio has to be enormous for the galactic
scale, i.e. about $10^{66}$. Taken at face value, this would be a very strong
indication in favour of the PBB scenario, more particularly of DDI. Indeed,
only in such a framework is it natural to expect that the effective gauge
coupling grew during inflation by a factor whose logarithm is of the same
order as the number of inflationary e-folds.

Notice, however, that, unlike GW, e.m. perturbations interact quite 
considerably with the hot plasma of the early (post-Big Bang) Universe. 
Thus, converting the primordial seeds into those that may have existed 
at the protogalaxy formation epoch is by no means a trivial exercise (see, 
e.g. [53]). The question of whether or not the primordial seeds generated in 
PBB cosmology can evolve into the observed galactic magnetic fields thus 
remains, to this date, an unsolved, yet very interesting, problem. 

4.5 Axion perturbations: Seeds for CMBA and LSS? 

In four dimensions the curl of $B_{\mu\nu}$, $H_{\mu\nu\rho}$, is equivalent to a pseudoscalar
field, the (KR) axion $\sigma$, through

$$H_{\mu\nu\rho} = e^{\phi} \epsilon_{\mu\nu\rho\tau} \partial^{\tau} \sigma . \tag{4.20}$$

It is easy to see that, while the pump field for $B_{\mu\nu}$ is $a^{-2} e^{-\phi}$, that for $\sigma$
is $a^2 e^{\phi}$. Indeed their respective perturbations are related by the duality
of perturbations discussed in Section 4.1. We can use either description
with identical physical results. Note that, while $a$ and $\phi$ worked in opposite
directions for tensor and dilaton perturbations, generating strongly tilted
(blue) spectra, the two work in the same direction for axions, so that spectra
can be flat or even tilted towards large scales (red spectra) [54]. An
interesting fact is that, unlike the GW spectrum, that of axions is very sensitive
to the cosmological behaviour of internal dimensions during the DDI
epoch. On one side, this makes the model less predictive. On the other, it
tells us that axions represent a window over the multidimensional cosmology
expected generically from string theories, which must live in more than
four dimensions. Parametrizing the spectrum by:

$$\Omega_{\rm ax}(k) = \left( \frac{H_{\max}}{M_P} \right)^2 (k/k_{\max})^{\alpha} , \tag{4.21}$$

and considering the case of three non-compact and six compact dimensions
with separate isotropic evolution, one finds:

$$\alpha = \frac{3 + 3r^2 - 2\sqrt{3 + 6r^2}}{1 + 3r^2} , \tag{4.22}$$

where

$$r \equiv \frac12\, \frac{\dot V_6 V_3}{V_6 \dot V_3} \tag{4.23}$$

is a measure of the relative evolution of the internal and external volumes.
Equations (4.22, 4.23) show that the axion spectrum becomes exactly HZ
(i.e. scale-invariant) when $r = 1$, i.e. when all nine spatial dimensions of
superstring theory evolve in a rather symmetric way [55]. In situations near
this particularly symmetric one, axions are able to provide a new mechanism
for generating large-scale CMBA and LSS.

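Calculation of the effect gives [56], for massless axions:

$$l(l+1)\, C_l \simeq O(1) \left( \frac{H_{\max}}{M_P} \right)^4 (\eta_0 k_{\max})^{-2\alpha}\, \frac{\Gamma(l + \alpha)}{\Gamma(l - \alpha)} , \tag{4.24}$$

where $C_l$ are the usual coefficients of the multipole expansion of $\Delta T/T$

$$\langle \Delta T/T(\vec n)\; \Delta T/T(\vec n') \rangle = \sum_l (2l + 1)\, C_l P_l(\cos\theta) , \qquad \vec n \cdot \vec n' = \cos\theta , \tag{4.25}$$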

and $\eta_0 k_{\max} \sim 10^{30}$. In string theory, as repeatedly mentioned, we expect
$H_{\max}/M_P \sim M_s/M_P \sim 1/10$, while the exponent $\alpha$ depends on the explicit
PBB background with the above-mentioned HZ case corresponding to $\alpha = 0$.
The standard tilt parameter $n = n_s$ ($s$ for scalar) is given by $n = 1 + 2\alpha$
and is found, by COBE [57], to lie between 0.9 and 1.5, corresponding to
$0 < \alpha < 0.25$ (a negative $\alpha$ leads to some theoretical problems). With these
inputs we can see that the correct normalization ($C_2 \sim 10^{-10}$) is reached for
$\alpha \sim 0.1$, which is just in the middle of the allowed range. In other words,
unlike in standard inflation, we cannot predict the tilt, but when this is
given, we can predict (again unlike in standard inflation) the normalization.
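
The dependence of the tilt on the internal dimensions can be tabulated directly from the slope formula (4.22) and the relation between the tilt $n$ and the slope. The sketch below takes (4.22) in the form $\alpha = [3 + 3r^2 - 2\sqrt{3 + 6r^2}]/(1 + 3r^2)$ and $n = 1 + 2\alpha$; the function names and the bisection bracket are illustrative assumptions.

```python
import math

def axion_slope(r):
    """Spectral slope alpha(r) of (4.22) for three external and six internal dimensions."""
    return (3 + 3 * r * r - 2 * math.sqrt(3 + 6 * r * r)) / (1 + 3 * r * r)

def scalar_tilt(r):
    """Standard tilt parameter n = 1 + 2 alpha."""
    return 1 + 2 * axion_slope(r)

def r_for_slope(target, lo=0.0, hi=10.0, tol=1e-12):
    """Invert alpha(r) by bisection; alpha grows monotonically with r for r >= 0."""
    if not axion_slope(lo) <= target <= axion_slope(hi):
        raise ValueError("target slope outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if axion_slope(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_symmetric = axion_slope(1.0)   # HZ case
alpha_frozen = axion_slope(0.0)      # static internal dimensions
r_upper = r_for_slope(0.25)          # upper edge of the quoted COBE window
```

With static internal dimensions ($r = 0$) the slope is $3 - 2\sqrt{3} \approx -0.46$, i.e. a red spectrum, while slopes in the window $0 < \alpha < 0.25$ require $r$ somewhat larger than one.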

With some extra work [58] one can compute the $C_l$ in the acoustic-peak
region adding vector and tensor contributions from the seeds. It turns
out that the acoustic-peak structure is very sensitive to $\alpha$, hence to the
behaviour of the internal dimensions during the DDI phase. The value of
$\alpha$ singled out above by the normalization does not give peaks at all and,
as such, looks ruled out by the data. Values of $\alpha$ in the range 0.3-0.4 appear
to be preferred (especially in the presence of a cosmological constant with
$\Omega_\Lambda \sim 0.7$). We saw, however, that the overall normalization was very
sensitive to the value of $\alpha$. For $\alpha$ in the 0.3-0.4 range, the normalization is
off (way too small) by many orders of magnitude. Therefore, if present
indications are confirmed, as they seem to be from the recent release of the
Boomerang 1997 data analysis [59], one will be forced to a $k$-dependent $\alpha$,
meaning different phases in the evolution of internal dimensions during DDI.

4.6 Heating up the Universe 

Before closing this section, I wish to recall how one sees the very origin
of the hot Big Bang in this scenario. One can easily estimate the total
energy stored in the quantum fluctuations, which were amplified by the
pre-Big Bang backgrounds (for a discussion of generic perturbation spectra,
see [55,60]). The result is, roughly,

$$\rho_{\rm quantum} \sim N_{\rm eff} H_{\max}^4 , \tag{4.26}$$

where $N_{\rm eff}$ is the effective number of species that are amplified and $H_{\max}$ is
the maximal curvature scale reached around $t = 0$. We have already argued
that $H_{\max} \sim M_s = \lambda_s^{-1}$ and we know that, in heterotic string theory, $N_{\rm eff}$
is in the hundreds. Yet, this rather huge energy density is very far from
critical, as long as the dilaton is still in the weak-coupling region, justifying
our neglect of back-reaction effects. It is very tempting to assume [55] that,
precisely when the dilaton reaches a value such that $\rho_{\rm quantum}$ is critical,
the Universe will enter the radiation-dominated phase. This PBBB (PBB
bootstrap) constraint gives, typically:

$$e^{\phi_{\rm exit}} \sim 1/N_{\rm eff} , \tag{4.27}$$

i.e. a value for the dilaton close to its present value.
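
The estimate behind (4.27) can be sketched in a few lines. The inputs are assumptions of this sketch rather than statements quoted from the text above: the critical density is taken as $\rho_{\rm cr} \sim M_P^2 H^2$, the tree-level relation $M_P^2 \sim e^{-\phi} M_s^2$ is used, and $H \sim H_{\max} \sim M_s$ at the time of exit.

```latex
\begin{align*}
\rho_{\rm quantum} &\sim N_{\rm eff} H_{\max}^4 \sim N_{\rm eff} M_s^4 , \\
\rho_{\rm cr} &\sim M_P^2 H_{\max}^2 \sim e^{-\phi} M_s^4 , \\
\frac{\rho_{\rm quantum}}{\rho_{\rm cr}} &\sim N_{\rm eff}\, e^{\phi}
  \quad\Longrightarrow\quad e^{\phi_{\rm exit}} \sim \frac{1}{N_{\rm eff}} .
\end{align*}
```

As long as $e^{\phi} \ll 1/N_{\rm eff}$ the ratio is small, which is the sense in which the amplified fluctuations are far from critical in the weak-coupling region.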

The entropy in these quantum fluctuations can also be estimated following
some general results [61]. The result for the density of entropy $S$ is, as
expected,

$$S \sim N_{\rm eff} H_{\max}^3 . \tag{4.28}$$

It is easy to check that, at the assumed time of exit given by (4.27), this
entropy saturates recently proposed holography bounds. The discussion
of such bounds is postponed to Section 5.7 since it also has interesting
implications for the exit problem.



5 How could it have stopped? 

We have argued that, generically, DDI, when considered at lowest order in 
derivatives and coupling, evolves towards a singularity of the Big Bang type. 
Similarly, at the same level of approximation, non-inflationary solutions of 
the FRW type emerge from a singularity. Matching these two branches in a 
smooth, non-singular way has become known as the (graceful) exit problem 
in string cosmology [62]. It is, undoubtedly, the most important theoretical 
problem the PBB scenario is facing today. 

Of course, one would not only like to know that a graceful exit does take 
place: one would also like to describe the transition between the two phases 
in a quantitative way. Achieving this goal would amount to nothing less 
than a full description of what replaces the Big Bang of standard cosmol- 
ogy in the PBB scenario. As mentioned in Section 1, this difficult problem is 
the analogue, in string cosmology, of the (still not fully solved) confinement 
problem of QCD. The exit problem is particularly hard because, by its very 
nature, and by the existing no-go theorems [62], it must occur, if at all, at 
large curvature and/or coupling and, because of fast time-dependence, must 
break (spontaneously) supersymmetry. The phenomenological predictions 
made in the previous section were based on the assumption that i) a grace- 
ful exit does take place; ii) sufficiently large scales are only affected by it 
kinematically, i.e. through an overall red-shift of all scales. 

In this section, after recalling some no-go theorems for the exit, we 
will review various proposals that circumvent those theorems starting from 
the mathematically simplest, but physically least realistic, proposals and 
ending with the physically favoured, but harder to analyse, suggestions. 
The latter proposals suggest possible lines along which a quantitative de- 
scription of the exit might eventually emerge. Needless to say, in spite of 
the many encouraging results, much work remains to be done: perhaps 
new techniques, and/or a deeper understanding of string theory in its non- 
perturbative regimes through the construction of the still largely unknown 
M-theory [10], need to be developed before a full quantitative description 
can be hoped for. 

I should also mention that there have been suggestions [63] that the BB
singularity can be avoided if the DDI phase is highly anisotropic. While
this is an interesting suggestion, with isotropization taking place later in
the non-inflationary regime, we will stick here to the simplest case, in which
DDI has already prepared a very homogeneous Universe before exit takes
place. This is why our discussion on the exit problem is limited to the case
of homogeneous cosmologies. Also, for lack of space, I shall refer the reader
to the literature for most of the details.



5.1 No-go theorems 

Under some restrictive conditions [62] , it was shown that one cannot have a 
change of branch, i.e. that the Universe cannot make a permanent transi- 
tion from the inflationary pre-Big Bang to a FRW post-Big Bang solution. 
Perhaps the best way to convey the physical meaning of those theorems 
is in terms of the necessary conditions for exit recently given by Brustein 
and Madden [64]. These authors give necessary conditions for two subse- 
quent events to occur: firstly, a branch change in the string frame should 
take place: this imposes the violation of some energy conditions; secondly, 
a bounce should occur in the E-frame metric since, as we have seen in 
Section 3, DDI represents a collapse in the E-frame. This latter transition 
requires further violation of energy conditions. 

Before the reader gets too worried about these violations, I should point 
out that these refer to the equations of state satisfied by some “effective” 
sources, which include both higher-derivative and higher-loop corrections. 
It is well known that such sources generically do lead to violations of the 
standard energy conditions satisfied by normal matter or radiation-like clas- 
sical sources. 



5.2 Exit via a non-local V 

This is perhaps the simplest example of an exit. It was first discussed
in [65]. The reason why this is not considered an appealing mechanism
for the exit is that the potential it employs depends on $\bar\phi$ (instead of $\phi$), in
order to preserve SFD. By general covariance such a potential, if non-trivial,
must be non-local. Unfortunately, there has been no convincing proposal
to explain how such non-local potentials might arise within a superstring
theory framework.

5.3 Exit via $B_{ij}$

The antisymmetric KR field may lead to violation of the energy conditions
and thus induce an exit. Some amusing examples were given in [66], where
a non-trivial $B_{ij}$ is introduced through $O(d,d)$ transformations acting on a
pure metric-dilaton cosmology of the type described in Section 2.5. It was
found that these so-called “boosted” cosmologies were less singular than the
original ones. In some cases they were even completely free of singularity
and provided examples of exit, albeit in not-so-realistic situations.
It is tempting to speculate that this softening of singularities, due to a
non-trivial $B_{ij}$ field, could be related to recent developments in the field of
non-commutative geometry [67] induced by a $B_{ij}$ field. Work along these
lines is in progress.

5.4 Exit via quantum tunnelling 

Several groups [68] have attempted to describe the transition from the pre-
to the post-Big Bang without modifying the low-energy tree-level effective
action, by exploiting the quantum cosmology approach based on the
Wheeler-DeWitt (WDW) equation. In references [68] an $O(d,d)$-invariant
WDW equation was derived in the $(d^2 + 1)$-dimensional mini-superspace
consisting of a homogeneous Bianchi I metric, the antisymmetric tensor,
and the dilaton. The $O(d,d)$ symmetry helps to avoid the ordering ambiguities
which usually plague the WDW equation. For the time being
only the mathematically simpler case of an $O(d,d)$-invariant potential $V(\bar\phi)$
has been analysed since, in that case, $d^2$ conserved charges can be defined
and the “radial” part of the WDW equation reduces to a one-dimensional
Schrödinger equation for a scattering problem.

It is amusing that, from such a point of view, the initial state of the
Universe is described by a right-moving plane wave, which later encounters
a potential, giving rise to both a transmitted and a reflected (i.e. left-moving)
wave. The transmission coefficient gives the probability that the
Universe ends up in the pre-Big-Bang singularity, while the reflection coefficient
gives the probability of a successful exit into the post-Big-Bang
decelerating expansion.

For certain forms of $V(\bar\phi)$ the wave is classically reflected and the WDW
approach just confirms this expectation by giving a 100% probability for
the exit. However, even when there is no classical exit, the probability of
wave-reflection is non-zero because of quantum tunnelling. The quantum
probability of a classically forbidden exit turns out to be exponentially suppressed
in the coupling constant $e^{\phi}$, which is just fine. Unfortunately, it is
also exponentially suppressed in the total volume of 3-space (in string units)
after the pre-Big-Bang. Thus, only tiny regions of space have a reasonable
chance to tunnel.



5.5 Higher-derivative corrections 

While the examples of exit given in the previous subsections are theoretically
interesting, they do look somewhat artificial and non-generic. In this and
in the following subsection we shall describe two mechanisms for exit that
involve very general properties of the lowest order solutions and of string
theory. The present feeling is that, if graceful exit occurs, it should be mainly
induced by some combination of higher-derivative and higher-loop effects.
Let us start with the former.

Toy examples have shown [69] that DDI can flow, thanks to higher-curvature
corrections, towards a de Sitter-like phase, i.e. into a phase of
constant $H$ (curvature) and constant $\dot\phi$. This phase is expected to last
until loop corrections become important and give rise to a transition to a
radiation-dominated phase (see the next subsection). The idea is to justify
the strong curvature transition from the dilatonic to the string phase by
proving the existence of an exact de Sitter-like solution to the field equation,
which acts as a late time attractor for the perturbative DDI branch. As
shown in [69], the existence of such attractors depends on the existence of
(non-trivial) solutions for a system of n algebraic equations in n unknowns.
In general, we may expect a discrete number of solutions to exist. If at least
one of them has some qualitative characteristics, it will act as a late-time
attractor for solutions approaching DDI in the far past. An explicit example
of this phenomenon was constructed in [69]. In this connection, it is worth
mentioning that solutions connecting duality-related low-energy branches
through a high-curvature CFT were already proposed in [70].

It was recently pointed out [71] that the reverse order of events is also 
possible. The coupling may become large before the curvature does. In 
this case, at least for some time, the low-energy limit of M-theory should 
be adequate: this limit is known [10] to give D = 11 supergravity and is 
therefore amenable to reliable study. It is likely, though not yet clear, that, 
also in this case, strong curvatures will have to be reached before the exit 
can be completed. 

5.6 Loop corrections and back reaction

The idea here is to invoke the back reaction from particle production as the 
relevant mechanism. Since the back reaction is an $O(e^{\phi} \alpha' H^2)$ correction, its
effect is contained in one-loop $O(R^2)$ contributions to the effective action. A
recent calculation [72] shows that, indeed, loop corrections to DDI work in 
the right direction and become relevant precisely when expected according 
to the exit criterion (4.27). 

A class of such contributions was analysed some time ago by Antoniadis
et al. [73] in the case of a spatially flat ($k = 0$) cosmology and by Easther and
Maeda [74] in the case of a closed Universe ($k = 1$). Both groups find
non-singular solutions to the loop-corrected field equations. However, neither
group is actually able to obtain solutions that start in the dilaton-driven
superinflationary regime and later evolve through a branch change.

More recently, several examples of full exit have been constructed [75]. 
Although they are based on $\alpha'$- and loop-corrected actions that have not
been derived from reliable calculations, they seem to indicate, at least, that
the BM conditions for exit may turn out to be not just necessary but also
sufficient. It also appears [76] that exit occurs when the entropy bound
becomes threatened by the entropy in the amplified/squeezed quantum fluc- 
tuations, as we shall now discuss. 

5.7 Entropy considerations

Entropy-related considerations have recently led to model-independent ar- 
guments in favour of the occurrence of a graceful exit in string cosmology. 
As we shall see, those are physically quite close to the arguments based 
on back-reaction and loop corrections, which we have just discussed in the 
previous subsection. 

Almost twenty years ago Bekenstein [77] suggested that, for a system of
limited gravity, of energy $E$ and whose size $R$ is larger than its gravitational
radius, $R > R_g \equiv 2 G_N E$, entropy is bounded by $S_{\rm BEB}$:

$S_{\rm BEB} = ER/\hbar = R_g R\, l_P^{-2}$ . (5.1)

Holography [78] suggests that maximal entropy is bounded by $S_{\rm HOL}$,

$S_{\rm HOL} = A\, l_P^{-2}$ , (5.2)

where $A$ is the area of the space-like surface enclosing the region of space
whose entropy we wish to bound. For systems of limited gravity, since
$R > R_g$ and $A = R^2$, (5.1) implies the holography bound (5.2).

Can these entropy bounds be applied to the whole Universe, i.e. to 
cosmology? A cosmological Universe is not a system of limited gravity, 
since its large-distance behaviour is determined by the gravitational effect 
of its matter content through Friedmann’s equation (2.4). Furthermore, the 
holography bound obviously fails for sufficiently large regions of space since,
for a given temperature, entropy grows like $R^3$ while area grows like $R^2$.
The generalization of entropy bounds to cosmology turned out to be subtle. 
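
The failure of the naive holography bound at large scales can be made explicit with a rough estimate. The sketch below assumes, for illustration only, a relativistic gas at temperature $T$ with entropy density of order $T^3$ (units $\hbar = c = k_B = 1$) and drops all numerical factors; the symbol $R_{\rm max}$ is introduced here and does not appear in the text.

```latex
S_{\rm matter} \sim T^{3} R^{3} , \qquad S_{\rm HOL} \sim R^{2}\, l_P^{-2}
\quad\Longrightarrow\quad
S_{\rm matter} > S_{\rm HOL} \;\;\text{for}\;\; R > R_{\rm max} \sim \frac{1}{T^{3}\, l_P^{2}} .
```

Under these assumptions, any region larger than $R_{\rm max}$ would contain more entropy than its bounding area allows, which is why the bound cannot be applied to arbitrarily large spatial regions.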

In 1989, Bekenstein himself [79] gave a prescription for a cosmological 
extension by choosing R in equation (5.1) to be the particle horizon. Amus- 
ingly, he arrived at the conclusion that the bound is violated sufficiently 
near the Big-Bang singularity, implying that the latter is fake (if the bound 
is always valid). About a year ago, Fischler and Susskind (FS) [80] pro- 
posed a similar extension of the holographic bound to cosmology, arguing 
that the area of the particle horizon should bound entropy on the backward- 
looking light cone according to (5.2). It was soon realized, however, that 
the FS proposal requires modifications, since violations of it were found to 
occur in physically reasonable situations. An improvement of the FS bound 
applicable to light-like hypersurfaces was later made by Bousso [81]. 

Of more interest here are the attempts made at deriving cosmologi- 
cal entropy bounds on space-like hypersurfaces [82-86]. These identify the 
maximal size of a spatial region for which holography works: the Hubble 
radius [82,83,85], the apparent horizon [84], or, finally, a so-called causal
connection (Jeans) scale [86]. 

For our purposes here, we do not need to enter into the relative merits 
of these various proposals. Rather, we will only outline the physical idea 
behind them. Consider, inside a quasi-homogeneous Universe, a sphere
of radius $H^{-1}$. We may consider “isolated” bodies, in the sense of
reference [77], fully contained in the sphere, i.e. with radius $R < H^{-1}$. For
such systems, the usual Bekenstein bound (5.1) holds and is saturated by a black hole of size $R$.
We may consider next several black holes inside our Hubble volume, each 
carrying an entropy proportional to the square of its mass. If two, or more, 
of these black holes merge, their masses will add up, while the total entropy 
after the merging, being quadratic in the total mass, will exceed the sum of 
the initial entropies. In other words, in order to maximize entropy, it pays 
to form black holes as large as possible. 
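
The entropy gain from merging can be illustrated numerically. The following minimal sketch assumes Schwarzschild black holes with Bekenstein-Hawking entropy $S = 4\pi M^2$ in Planck units ($G = \hbar = c = k_B = 1$) and assumes that no mass is radiated away in the merger; the function names are illustrative and not taken from the text.

```python
import math


def bh_entropy(mass: float) -> float:
    """Bekenstein-Hawking entropy S = 4*pi*M^2 of a Schwarzschild black hole.

    Planck units (G = hbar = c = k_B = 1) are assumed.
    """
    return 4.0 * math.pi * mass ** 2


def merger_entropy_gain(masses) -> float:
    """Entropy of the merged hole minus the summed entropies of the initial holes.

    Idealised assumption: the masses simply add up (no energy is radiated).
    """
    total_mass = sum(masses)
    return bh_entropy(total_mass) - sum(bh_entropy(m) for m in masses)


if __name__ == "__main__":
    # Two holes of mass 1 and 2 (Planck units): the gain is 8*pi*M1*M2.
    print(merger_entropy_gain([1.0, 2.0]))
```

For two holes the gain is $8\pi M_1 M_2$, which is positive for any pair of masses, in line with the statement that entropy is maximized by forming black holes as large as possible.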

Is there a limit to this process of entropy increase? The suggestion made
in [82-86], which finds support in old results by several groups [87], is that
a critical length of order $H^{-1}$ is the upper limit on how large a classically
stable black hole can be. If we accept this hypothesis, the upper bound
on the entropy contained in a given region $\mathcal{R}$ of space will be given by the
number of Hubble volumes in $\mathcal{R}$, $n_H = V H^3$, times the Bekenstein-Hawking
entropy of a BH of radius $H^{-1}$, $S_{\rm BH} \sim H^{-2} l_P^{-2}$. The two factors can be combined
in the suggestive formula:

$S(\mathcal{R}) < l_P^{-2} \int_{\mathcal{R}} d^3x\, \sqrt{h}\, \mathcal{H} \equiv S_{\rm HB}$ , (5.3)

where $d^3x\, \sqrt{h}$ is the volume element of the space-like hypersurface whose entropy
we wish to bound, and $\mathcal{H}$ differs from one proposal to another [82-86], but
is, roughly, of the order of the Hubble parameter. Actually, since $H$ is
proportional to the trace of the second fundamental form on the hypersurface,
equation (5.3) reminds us of the boundary term that has to be added to the
gravitational action in order to correctly derive Einstein’s equations from
the usual variational principle. This shows that the bound (5.3) is generally
covariant for $\mathcal{H} = H$. It can also be written covariantly for the identification
of $\mathcal{H}$ made in [86].

For the qualitative discussion that follows, let us simply take $\mathcal{H} = H$
and let us convert the bound to string-frame quantities, taking into account
the relation between $l_P$ and $\lambda_s$ given in equation (2.3). We obtain [83]:

$S(\mathcal{R}) < e^{-\bar{\phi}} H$ , (5.4)

where we have fixed an arbitrary additive constant in the definition (2.6) 
of $\bar{\phi}$. Equation (5.4) thus connects very simply the entropy bound of a
region of fixed comoving volume to the most important variables occurring 
in string cosmology (see, e.g., the phase diagram of Fig. 3). 

An immediate application of the bound (5.4) was pointed out in [83].
Noting that the bound is initially saturated in the BDV picture [31] of 
collapse/inflation, the bound itself cannot decrease without a violation of 
the second law. This gives immediately:

$\dot{\bar{\phi}} \le \dot{H}/H$ . (5.5)

It is easy to check that this inequality is obeyed, but just so, during DDI, 
in the sense that it holds with the equality sign. In other words, the HEB is 
saturated initially and throughout DDI in the BDV picture. The bound also 
turns out to give a physically acceptable value for the entropy of the Universe 
just after the Big Bang: a large entropy on the one hand (about $10^{90}$); a
small entropy for the total mass and size of the observable Universe on the 
other, as often pointed out by Penrose [88]. Thus, PBB cosmology neatly 
explains why the Universe, at the Big Bang, looks so fine-tuned (without 
being so) and provides a natural arrow of time in the direction of higher 
entropy [83]. 
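
The saturation of the bound during DDI can be checked in a couple of lines. The sketch below assumes the standard isotropic dilaton-driven solution in three spatial dimensions, $a(t) \propto (-t)^{-1/\sqrt{3}}$ and $\bar{\phi}(t) = -\ln(-t) + {\rm const}$ for $t < 0$; this explicit form is not restated in this subsection and is quoted here as an assumption. The inequality being tested is (5.5), $\dot{\bar{\phi}} \le \dot{H}/H$.

```latex
H = \frac{\dot a}{a} = \frac{1}{\sqrt{3}\,(-t)} , \qquad
\dot H = \frac{1}{\sqrt{3}\, t^{2}} , \qquad
\frac{\dot H}{H} = \frac{1}{(-t)} = \dot{\bar{\phi}} .
```

Both sides coincide for all $t < 0$, so under these assumptions the inequality holds with the equality sign throughout the dilaton-driven phase.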

What happens in the mysterious string phase, where we are desperately
short of reliable techniques? It is quite clear that equation (5.5) does not
allow $H$ to reach saturation ($\dot{H} = 0$) in the first quadrant of Figure 3 since
$\dot{\bar{\phi}} > 0$ there. Instead, saturation of $H$ in the second quadrant (where $\dot{\bar{\phi}} < 0$)
is perfectly all right. But this implies having attained the sought-for branch
change!

Let us finally look at the loop corrections. Since, physically, these corre- 
spond to taking into account the back-reaction from particle production, we 
may check when the entropy in the cosmologically produced particles starts 
to threaten the bound. As discussed in Section 4.6, the entropy density
in quantum fluctuations is given by $\sigma \sim N_{\rm eff} H^3$, which equals the bound
$\sigma_{\rm HEB} \sim H l_P^{-2}$ precisely when $l_P^2 H^2 N_{\rm eff} = O(1)$. But, as already pointed
out, this is just the line on which the energy density in quantum
fluctuations becomes critical (see Eq. (4.27)) and where, according to [69], the
back-reaction becomes $O(1)$. Similar conclusions are reached by applying
generalized second law arguments [89].
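
The coincidence of the two criteria is a one-line consequence of the scalings just quoted. The sketch below uses only $\sigma \sim N_{\rm eff} H^3$ and $\sigma_{\rm HEB} \sim H l_P^{-2}$, with numerical factors dropped; the subscript "eff" is a reading of the garbled label on the number of species and should be treated as an assumption.

```latex
\frac{\sigma}{\sigma_{\rm HEB}} \sim \frac{N_{\rm eff} H^{3}}{H\, l_P^{-2}} = l_P^{2} H^{2} N_{\rm eff} ,
\qquad
\sigma \sim \sigma_{\rm HEB} \;\Longleftrightarrow\; l_P^{2} H^{2} N_{\rm eff} = O(1) .
```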

The picture that finally emerges from all these considerations is best 
illustrated with reference to the diagram of Figure 4. Two lines are shown, 
representing boundaries for the possible evolution. The horizontal
boundary is imposed by the large-curvature corrections, while the tilted line
in the first quadrant corresponds to the equation $l_P^2 H^2 N_{\rm eff} = O(1)$ that we
have just discussed. Amusingly, this line was also suggested by Maggiore 
and Riotto [71] as a boundary beyond which copious production of 0-branes 
would set in. Thus, depending on initial conditions, the PBB bubble cor- 
responding to our Universe would hit first either the high-curvature or the 
large-entropy boundary and initiate an exit phase. Hopefully, a universal 
late-time attractor will emerge guiding the evolution into the FRW phase
of standard cosmology (shown as a vertical line in Fig. 10). 

Needless to say, all this has to be considered, at best, as having heuristic 
value. If taken seriously, it would suggest that the Universe will never enter 
the strong-coupling, strong-curvature regime, where the largely unknown 
M-theory should be used. The low-energy limit of the latter (the much 
better understood 11-D supergravity) could suffice to deal with the fun- 
damental exit problem of string cosmology. We refer to the literature for 
several other attempts at M-cosmology [90]. 

6 Outlook 

The outlook for the pre-Big Bang scenario, as formulated at present, is not 
necessarily an optimistic one. I am not sure I would bet a lot of money 
on it being right! But this is not really the issue. We have to remember 
that the PBB scenario is a top-down approach to cosmology. As stressed 
in the introduction, it would be quite a miracle if the correct model could 
be guessed without extensive feed-back from the data. The good news here 
is that new data are coming in all the time, and will continue to do so with 
more and more precision in the coming years! 

Rather, we should draw some lessons from this new attempt at very early 
cosmology, whether it succeeds or it fails. As I see it, the main lessons to
be drawn are the following: 

• our Universe did not have to emerge, together with space and time, 
from a singularity; in string theory, the singularity should be fake, 
either because it is tamed by finite-string-size effects, or because it 
simply signals the need for new degrees of freedom in the description 
of physics at very short distances; 

• because string theory is an extension of GR, inflation is possible in 
that context even in the absence of potential energy (i.e. of an effective
cosmological constant); actually, inflation is very natural and easy to 
achieve, being a consequence of the duality symmetries of the string- 
cosmology equations; 

• inflation in string cosmology can be related, mathematically, to the 
problem of gravitational collapse in GR; as such, it is a generic phe- 
nomenon, once the assumption of asymptotic past triviality is made; 
furthermore, the curvature scale and the coupling at the onset of PBB 
inflation are arbitrary classical moduli; 

• the Universe did not have to start hot! A hot Universe can emerge 
from a cold one thanks to quantum particle production in inflationary 
backgrounds; 

• PBB cosmology predicts a rich variety of perturbations, with
different spectra depending on each perturbation’s “pump” field and
on its evolution in the PBB era; observable relics of these perturba- 
tions may serve as a window on physics in the pre-bangian Universe 
all the way down to the string/Planck scale; 

• the simplest PBB models either predict too small perturbations at 
large scales, or a spectrum of isocurvature perturbations which may 
be already “experimentally challenged” (as Rocky Kolb would kindly 
say); 

• the exit problem still remains the hardest theoretical challenge to the 
whole idea of PBB cosmology; 

• hopefully, the combination of the above-mentioned experimental and
theoretical challenges will be able to tell us whether the PBB idea is 
just doomed, or whether parts of it should be kept while searching for 
a better scenario; it should also suggest new avenues for physics-driven 
research in string/M-theory; 

• last but not least, the PBB idea has taught us that we need not lock 
ourselves into preconceived ideas in cosmology (cf. “the Big Bang is 
the beginning of time” , “inflation needs a scalar potential”); rather, we 
should contemplate as wide a range of theoretically sound possibilities 
as we can in order for Nature to choose, at best, one of them. 

Notes added in proofs

Since these lectures were written up there have been (at least) two new 
developments worth mentioning:

1. it has been argued [91] that, when PBB cosmology is considered 
in the presence of all the antisymmetric fields (forms) occurring in 
all known superstring/M-theories, BKL behaviour [24] becomes once 
more generic. This could mean that the high curvature/large coupling 
regimes are approached through a PBB phase of BKL oscillations. The 
implications of this claim for PBB cosmology are still under study; 

2. an explicit, analytic example of collapse/inflation from APT data has
been constructed [92] in the case of colliding planar gravitational and
dilatonic waves. It allows one to connect analytically the initial data to
the Kasner exponents near the singularity and to relate the duration 
of inflation to the strength of the initial waves. The explicit findings 
appear to confirm the validity of the collapse/inflation picture pro- 
posed in [31]. This work has been generalized to arbitrary dimensions 
of space and to waves containing also the $B_{\mu\nu}$ field [93].